Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c
On Thu, Nov 07, 2013 at 07:17:19AM +, Nathan Sidwell wrote: > On 11/06/13 21:25, Dodji Seketeli wrote: > >It appeared that gcov.c and input.c both have a static function named > >read_line. I guess we ought to keep just one. > > As a general rule, that's good. Generally yes, but IMHO not in this case, both because of the problems you are mentioning, because it already now needs to do quite different things and because I think in the future the version for caret diagnostics should do much more complicated things (caching of the lines for faster further caret diagnostics). Furthermore, ftell/fseek aren't something that should be used unless strictly necessary. Jakub
Re: [gomp4 1/9] Add missing include.
On Wed, Nov 06, 2013 at 08:42:15PM +0100, tho...@codesourcery.com wrote: > From: Thomas Schwinge > > libgomp/ > * libgomp_g.h: Include for size_t. I'm surprised why it is needed, because libgomp.h that includes it includes stdlib.h that should provide size_t too. But, including this certainly doesn't hurt and makes the header more self-contained. So, ok for trunk/gomp-4_0-branch. Jakub
Re: [gomp4 2/9] libgomp: Prepare for testcases without -fopenmp.
On Wed, Nov 06, 2013 at 08:42:16PM +0100, tho...@codesourcery.com wrote: > From: Thomas Schwinge > > libgomp/ > * testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to > ALWAYS_CFLAGS. > * testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp. > * testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise. > * testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise. > * testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS): > Likewise. Ok for trunk/gomp-4_0-branch. Jakub
Re: [gomp4 3/9] OpenACC: Recognize -fopenacc.
On Wed, Nov 06, 2013 at 08:42:17PM +0100, tho...@codesourcery.com wrote: > From: Thomas Schwinge > > gcc/c-family/ > * c.opt (fopenacc): New option. > gcc/fortran/ > * lang.opt (fopenacc): New option. > * invoke.texi (-fopenacc): Document it. > * gfortran.h (gfc_option_t): New member. > * options.c (gfc_init_options, gfc_handle_option): Handle it. > gcc/testsuite/ > * lib/target-supports.exp (check_effective_target_fopenacc): New > procedure. > gcc/ > * doc/invoke.texi (-fopenacc): Document it. > * doc/sourcebuild.texi (fopenacc): Document it. > gcc/testsuite/ > * c-c++-common/cpp/openacc-define-1.c: New file. > * c-c++-common/cpp/openacc-define-2.c: Likewise. > * c-c++-common/cpp/openacc-define-3.c: Likewise. > * gfortran.dg/openacc-define-1.f90: Likewise. > * gfortran.dg/openacc-define-2.f90: Likewise. > * gfortran.dg/openacc-define-3.f90: Likewise. Ok for branch. Jakub
Re: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.
On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote: > --- gcc/config/arc/arc.h > +++ gcc/config/arc/arc.h > @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3. If not see > %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} > %{r}\ > %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\ > %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\ > -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ > +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ Perhaps add here |ftree-parallelize-loops=* too? >"%{!shared: \ > - %{mt|pthread:%{fopenmp:-lrt} -lpthread} \ > + %{mt|pthread:%{fopenacc|fopenmp:-lrt} -lpthread} \ Likewise. > %{p:%{!mlp64:-L/usr/lib/hpux32/libp} \ >%{mlp64:-L/usr/lib/hpux64/libp} -lprof} \ > %{pg:%{!mlp64:-L/usr/lib/hpux32/libp} \ > diff --git gcc/config/pa/pa-hpux11.h gcc/config/pa/pa-hpux11.h > index c217398..0676d74 100644 > --- gcc/config/pa/pa-hpux11.h > +++ gcc/config/pa/pa-hpux11.h > @@ -120,7 +120,7 @@ along with GCC; see the file COPYING3. If not see > #undef LIB_SPEC > #define LIB_SPEC \ >"%{!shared:\ > - %{fopenmp:%{static:-a archive_shared} -lrt %{static:-a archive}}\ > + %{fopenacc|fopenmp:%{static:-a archive_shared} -lrt %{static:-a > archive}}\ Ditto (and several times further). Otherwise LGTM for branch. Jakub
Re: [gomp4 5/9] OpenACC: preprocessor definition, Fortran integer parameter.
On Wed, Nov 06, 2013 at 08:42:19PM +0100, tho...@codesourcery.com wrote: > From: Thomas Schwinge > > gcc/c-family/ > * c-cppbuiltin.c (c_cpp_builtins): Conditionally define _OPENACC. > gcc/fortran/ > * cpp.c (cpp_define_builtins): Conditionally define _OPENACC. > gcc/testsuite/ > * c-c++-common/cpp/openacc-define-1.c: Test _OPENACC. > * c-c++-common/cpp/openacc-define-2.c: Likewise. > * c-c++-common/cpp/openacc-define-3.c: Likewise. > * gfortran.dg/openacc-define-1.f90: Likewise. > * gfortran.dg/openacc-define-2.f90: Likewise. > * gfortran.dg/openacc-define-3.f90: Likewise. > libgomp/ > * openacc.f90 (openacc_version): New integer parameter. > * openacc_lib.h (openacc_version): Likewise. > * testsuite/libgomp.oacc-fortran/openacc_version-1.f: New file. > * testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise. Ok for branch. Jakub
Re: [gomp4 7/9] OpenACC: Use OpenMP's lowering and expansion passes.
On Wed, Nov 06, 2013 at 08:42:21PM +0100, tho...@codesourcery.com wrote: > From: Thomas Schwinge > > gcc/ > * gimplify.c (gimplify_body): Consider flag_openacc additionally > to flag_openmp. > * omp-low.c (execute_expand_omp, execute_lower_omp) > (gate_diagnose_omp_blocks): Likewise. > gcc/testsuite/ > * gcc.dg/goacc-gomp/goacc-gomp.exp: New file. > * gcc.dg/goacc/goacc.exp: Likewise. Ok for branch. Jakub
Re: [gomp4 6/9] OpenACC: Infrastructure for builtins.
On Wed, Nov 06, 2013 at 08:42:20PM +0100, tho...@codesourcery.com wrote: > From: Thomas Schwinge > > gcc/ > * oacc-builtins.def: New file. > * Makefile.in (BUILTINS_DEF): Add oacc-builtins.def. > * builtins.def (DEF_GOACC_BUILTIN): New macro. > Include "oacc-builtins.def". > gcc/fortran/ > * f95-lang.c (DEF_GOACC_BUILTIN): New macro. > Include "../oacc-builtins.def". > libgomp/ > * libgomp.map (GOACC_2.0): New symbol version. Ok for branch. Jakub
Re: [gomp4 8/9] OpenACC: Basic support for #pragma acc in the C front end.
On Wed, Nov 06, 2013 at 08:42:22PM +0100, tho...@codesourcery.com wrote: > From: Thomas Schwinge > > gcc/c-family/ > * c-pragma.c (oacc_pragmas): New array. > (c_pp_lookup_pragma, init_pragma): Handle it. > gcc/ > * doc/invoke.texi (-fopenacc): Update. > > gcc/c/ > * c-parser.c (c_parser_omp_all_clauses): Make a parser error > message suitable for OpenACC, too. > gcc/cp/ > * parser.c (cp_parser_omp_all_clauses): Make a parser error > message suitable for OpenACC, too. Ok for branch. Jakub
Re: [gomp4 9/9] OpenACC: Basic support for #pragma acc parallel.
On Wed, Nov 06, 2013 at 08:53:00PM +0100, Thomas Schwinge wrote: > Forgot to pass the --patience switch to Git, so the diff algorithm > decided to first patch the existing expand_omp_taskreg into > expand_oacc_parallel, and then later re-add expand_omp_taskreg. Here's a > more readable version of that patch, avoiding all that back'n'forth: I've just quickly skimmed this, other than what I've mentioned earlier I haven't found anything totally wrong, so with that thing fixed I guess it is ok for the branch, but my quick skimming was far from a proper review. Jakub
Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification
On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote: > Hmmm, good point. I've moved update_stmt and company to the caller, > and modified the caller to call regimplify_operands only for > GIMPLE_RETURN which is the special case. Can't you (later) handle that without regimplification too? > > Let me know if this is still ok so I can commit. Ok. Jakub
Re: [PATCH] Fix PR58941
On Wed, 6 Nov 2013, Richard Sandiford wrote: > Richard Biener writes: > > --- 599,615 > > > > exp = TREE_OPERAND (exp, 0); > > } > > > > + /* We need to deal with variable arrays ending structures. */ > > + if (seen_variable_array_ref > > + && maxsize != -1 > > + && (!bit_offset.fits_shwi () > > + || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1) > > + || (bit_offset.to_shwi () + maxsize > > + == (signed) TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp)) > > + maxsize = -1; > > I realise this was in the old code too, but should it be "HOST_WIDE_INT" > rather than "signed"? Hmm, indeed. I'll fix it. Thanks, Richard. > Ignore me if not. It just stood out as a bit weird in today's wide-int merge. > > Thanks, > Richard > > > > + > > + done: > > if (!bit_offset.fits_shwi ()) > > { > > *poffset = 0; > > *** get_ref_base_and_extent (tree exp, HOST_ > > *** 614,637 > > > > hbit_offset = bit_offset.to_shwi (); > > > > - /* We need to deal with variable arrays ending structures such as > > -struct { int length; int a[1]; } x; x.a[d] > > -struct { struct { int a; int b; } a[1]; } x; x.a[d].a > > -struct { struct { int a[1]; } a[1]; } x; x.a[0][d], x.a[d][0] > > -struct { int len; union { int a[1]; struct X x; } u; } x; x.u.a[d] > > - where we do not know maxsize for variable index accesses to > > - the array. The simplest way to conservatively deal with this > > - is to punt in the case that offset + maxsize reaches the > > - base type boundary. This needs to include possible trailing padding > > - that is there for alignment purposes. */ > > - > > - if (seen_variable_array_ref > > - && maxsize != -1 > > - && (!host_integerp (TYPE_SIZE (base_type), 1) > > - || (hbit_offset + maxsize > > - == (signed) TREE_INT_CST_LOW (TYPE_SIZE (base_type) > > - maxsize = -1; > > - > > /* In case of a decl or constant base object we can do better. */ > > > > if (DECL_P (exp)) > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Improve VRP assert creation for loops
On Wed, 6 Nov 2013, Jakub Jelinek wrote: > On Tue, Nov 05, 2013 at 02:00:16PM -0800, Cong Hou wrote: > > > I'm also curious -- did this code show up in a particular benchmark, if > > > so, > > > which one? > > > > I didn't find this problem from any benchmark, but from another > > concern about loop upper bound estimation. Look at the following code: > > > > int foo(unsigned int n, int r) > > { > > int i; > > if (n > 0) > > if (n < 4) > > { > > do { > > --n; > > r *= 2; > > } while (n > 0); > > } > > return r+n; > > } > > > > > > In order to get the upper bound of the loop in this function, GCC > > traverses conditions n<4 and n>0 separately and tries to get any > > useful information. But as those two conditions cannot be combined > > into one due to this issue (note that n>0 will be transformed into > > n!=0), when GCC sees n<4, it will consider the possibility that n may > > be equal to 0, in which case the upper bound is UINT_MAX. If those two > > conditions can be combined into one, which is n-1<=2, then we can get > > the correct upper bound of the loop. > > I've looked at the above testcase to see why we aren't able to determine > the number of iterations upper bound properly here. > > That doesn't mean your patch is useless, though I must say I'm far from > being convinced it is safe ATM and also it looks like quite ugly special > case (trying to undo a VRP optimization but only one single specific case). > > The first problem is VRP, we end up with: > : > # n_1 = PHI > # r_3 = PHI > # RANGE [0, 4294967295] > n_7 = n_1 + 4294967295; > # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe > r_8 = r_3 * 2; > if (n_7 != 0) > goto ; > else > goto ; > > : > goto ; > - missing range on n_1, extremely conservative range on n_7. With the > attached patch we get instead: > : > # RANGE [1, 3] NONZERO 0x3 > # n_1 = PHI > # r_3 = PHI > # RANGE [0, 2] NONZERO 0x3 > n_7 = n_1 + 4294967295; > # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe > r_8 = r_3 * 2; > if (n_7 != 0) > goto ; > else > goto ; > > : > goto ; > > The issue is that we use live_on_edge to determine if it is desirable > to added ASSERT_EXPRs, but as we walk bbs in RPO order, we first enter > the loop through the bb with exit edge and thus live of the latch isn't > computed (and, generally the propagation for live ignores dfs back edges > anyway), and because in the above live_on_edge ((5)->(6), n_7) is false, > we don't add ASSERT_EXPR that n_7 is != 0 in the latch bb, so during > iteration we start with n_1 being assumed [1, 3] (that is range of the > assertion from the preceeding conditions on n_5(D)), but in the next > iteration widen it to [0, 3] because we think n_7 can be [0, 2] in the > PHI and thus end up with uselessly wide range because we think the > subtraction can wrap around. This patch improves live_on_edge for > empty latches, by just looking at the PHIs on loop->header from the > latch -> header edge and noting which SSA_NAMEs are used there. > > I had to add -fno-tree-vrp to 4 unroll_*.c tests, because they disable > various tree optimizations already and want to test unrolling of loops > iterating exactly twice, but with this VRP change VRP is smart enough > to replace the PHI argument on the i IV from > - # i_13 = PHI > + # i_13 = PHI <1(3), 0(2)> > (the loop iterates exactly twice) and RTL unrolling doesn't do the > tested thing in that case. But, this should affect only loops that roll > exactly twice and those realy should be unrolled already far before, so IMHO > it isn't something to try to optimize better at the RTL level. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2013-11-06 Jakub Jelinek > > * tree-vrp.c (live_on_edge): If e->dest is an empty latch > of some loop, but live[e->dest->index] is not computed, look > for SSA_NAMEs used in loop header PHIs from the latch edge. > > * gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options. > * gcc.dg/unroll_2.c: Likewise. > * gcc.dg/unroll_3.c: Likewise. > * gcc.dg/unroll_4.c: Likewise. > > --- gcc/tree-vrp.c.jj 2013-11-06 08:48:01.0 +0100 > +++ gcc/tree-vrp.c2013-11-06 09:32:19.205104029 +0100 > @@ -92,6 +92,42 @@ static sbitmap *live; > static bool > live_on_edge (edge e, tree name) > { > + if (live[e->dest->index] == NULL) > +{ I'd rather have live[] computed "correctly" than the following which I think is a hack ... as you say the issue is ordering (or rather that there isn't an "order" for CFG cycles unless we want to iterate). For loop BB order we explicitely handle the latch, maybe just using a different order than RPO order, with special-casing the latch, makes more sense? Richard. > + /* Handle edges to empty latch blocks. If NAME appears > + in loop header phis on edges
Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level
I very strongly disagree with this. The standard needs to be high than "does it pass the test suite." What we are introducing is a case where the program will behave one way with optimization and another way without it. While, this is always true for timing dependent code, it is pretty rare for math. We should always tread carefully when doing that, and the excuse that it is "cleaner" does not, in my mind, fit the bill. On Nov 7, 2013, at 2:08 AM, Richard Sandiford wrote: > Kenneth Zadeck writes: >> So what is the big plan here? if you remove it here and then do not >> do it in wide int, then it is not going to be truncated. > > Yeah, that is the big plan for trees. Mainline doesn't truncate at the > tree level after: > >http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00285.html > > and this patch brings wide-int back in line. (The truncations in this patch > were only ever local to wide-int.) > > If you're wondering why we don't want to truncate at the tree level, > see the second half of: > >http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00174.html > > Thanks, > Richard > >> On 11/06/2013 05:10 PM, Richard Sandiford wrote: >>> Following the removal of SHIFT_COUNT_TRUNCATED from double-int, this patch >>> reverts the changed I'd made to mimic the old behaviour on wide-int. >>> >>> Tested on powerpc64-linux-gnu and by assembly comparison on a range of >>> targets. >>> OK to install? >>> >>> Thanks, >>> Richard >>> >>> >>> Index: gcc/fold-const.c >>> === >>> --- gcc/fold-const.c2013-11-05 13:06:56.985255941 + >>> +++ gcc/fold-const.c2013-11-05 13:12:28.805655903 + >>> @@ -1007,13 +1007,9 @@ int_const_binop_1 (enum tree_code code, >>> /* It's unclear from the C standard whether shifts can overflow. >>> The following code ignores overflow; perhaps a C standard >>> interpretation ruling is needed. */ >>> -res = wi::rshift (arg1, arg2, sign, >>> - SHIFT_COUNT_TRUNCATED >>> - ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0); >>> +res = wi::rshift (arg1, arg2, sign); >>>else >>> -res = wi::lshift (arg1, arg2, >>> - SHIFT_COUNT_TRUNCATED >>> - ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0); >>> +res = wi::lshift (arg1, arg2); >>>break; >>> >>> case RROTATE_EXPR: >>> Index: gcc/tree-ssa-ccp.c >>> === >>> --- gcc/tree-ssa-ccp.c2013-11-05 13:07:25.659474362 + >>> +++ gcc/tree-ssa-ccp.c2013-11-05 13:12:28.806655910 + >>> @@ -1272,20 +1272,15 @@ bit_value_binop_1 (enum tree_code code, >>>else >>> code = RSHIFT_EXPR; >>> } >>> - int shift_precision = SHIFT_COUNT_TRUNCATED ? width : 0; >>>if (code == RSHIFT_EXPR) >>> { >>> - *mask = wi::rshift (wi::ext (r1mask, width, sgn), >>> - shift, sgn, shift_precision); >>> - *val = wi::rshift (wi::ext (r1val, width, sgn), >>> - shift, sgn, shift_precision); >>> + *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn); >>> + *val = wi::rshift (wi::ext (r1val, width, sgn), shift, sgn); >>> } >>>else >>> { >>> - *mask = wi::ext (wi::lshift (r1mask, shift, shift_precision), >>> - width, sgn); >>> - *val = wi::ext (wi::lshift (r1val, shift, shift_precision), >>> - width, sgn); >>> + *mask = wi::ext (wi::lshift (r1mask, shift), width, sgn); >>> + *val = wi::ext (wi::lshift (r1val, shift), width, sgn); >>> } >>> } >>> }
Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level
On Thu, 7 Nov 2013, Kenneth Zadeck wrote: > I very strongly disagree with this. The standard needs to be high than "does > it pass the test suite." > > What we are introducing is a case where the program will behave one way > with optimization and another way without it. While, this is always > true for timing dependent code, it is pretty rare for math. We should > always tread carefully when doing that, and the excuse that it is > "cleaner" does not, in my mind, fit the bill. It happens always for code with undefined behavior. See the gazillions of INVALID bugreports that complain that -O0 produces different results than -On. Patch is ok. Thanks, Richard. > On Nov 7, 2013, at 2:08 AM, Richard Sandiford > wrote: > > > Kenneth Zadeck writes: > >> So what is the big plan here? if you remove it here and then do not > >> do it in wide int, then it is not going to be truncated. > > > > Yeah, that is the big plan for trees. Mainline doesn't truncate at the > > tree level after: > > > >http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00285.html > > > > and this patch brings wide-int back in line. (The truncations in this patch > > were only ever local to wide-int.) > > > > If you're wondering why we don't want to truncate at the tree level, > > see the second half of: > > > >http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00174.html > > > > Thanks, > > Richard > > > >> On 11/06/2013 05:10 PM, Richard Sandiford wrote: > >>> Following the removal of SHIFT_COUNT_TRUNCATED from double-int, this patch > >>> reverts the changed I'd made to mimic the old behaviour on wide-int. > >>> > >>> Tested on powerpc64-linux-gnu and by assembly comparison on a range of > >>> targets. > >>> OK to install? > >>> > >>> Thanks, > >>> Richard > >>> > >>> > >>> Index: gcc/fold-const.c > >>> === > >>> --- gcc/fold-const.c2013-11-05 13:06:56.985255941 + > >>> +++ gcc/fold-const.c2013-11-05 13:12:28.805655903 + > >>> @@ -1007,13 +1007,9 @@ int_const_binop_1 (enum tree_code code, > >>> /* It's unclear from the C standard whether shifts can overflow. > >>> The following code ignores overflow; perhaps a C standard > >>> interpretation ruling is needed. */ > >>> -res = wi::rshift (arg1, arg2, sign, > >>> - SHIFT_COUNT_TRUNCATED > >>> - ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0); > >>> +res = wi::rshift (arg1, arg2, sign); > >>>else > >>> -res = wi::lshift (arg1, arg2, > >>> - SHIFT_COUNT_TRUNCATED > >>> - ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0); > >>> +res = wi::lshift (arg1, arg2); > >>>break; > >>> > >>> case RROTATE_EXPR: > >>> Index: gcc/tree-ssa-ccp.c > >>> === > >>> --- gcc/tree-ssa-ccp.c2013-11-05 13:07:25.659474362 + > >>> +++ gcc/tree-ssa-ccp.c2013-11-05 13:12:28.806655910 + > >>> @@ -1272,20 +1272,15 @@ bit_value_binop_1 (enum tree_code code, > >>>else > >>> code = RSHIFT_EXPR; > >>> } > >>> - int shift_precision = SHIFT_COUNT_TRUNCATED ? width : 0; > >>>if (code == RSHIFT_EXPR) > >>> { > >>> - *mask = wi::rshift (wi::ext (r1mask, width, sgn), > >>> - shift, sgn, shift_precision); > >>> - *val = wi::rshift (wi::ext (r1val, width, sgn), > >>> - shift, sgn, shift_precision); > >>> + *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn); > >>> + *val = wi::rshift (wi::ext (r1val, width, sgn), shift, sgn); > >>> } > >>>else > >>> { > >>> - *mask = wi::ext (wi::lshift (r1mask, shift, shift_precision), > >>> - width, sgn); > >>> - *val = wi::ext (wi::lshift (r1val, shift, shift_precision), > >>> - width, sgn); > >>> + *mask = wi::ext (wi::lshift (r1mask, shift), width, sgn); > >>> + *val = wi::ext (wi::lshift (r1val, shift), width, sgn); > >>> } > >>> } > >>> } > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Improve VRP assert creation for loops
On Thu, Nov 07, 2013 at 09:48:46AM +0100, Richard Biener wrote: > > --- gcc/tree-vrp.c.jj 2013-11-06 08:48:01.0 +0100 > > +++ gcc/tree-vrp.c 2013-11-06 09:32:19.205104029 +0100 > > @@ -92,6 +92,42 @@ static sbitmap *live; > > static bool > > live_on_edge (edge e, tree name) > > { > > + if (live[e->dest->index] == NULL) > > +{ > > I'd rather have live[] computed "correctly" than the following > which I think is a hack ... as you say the issue is ordering > (or rather that there isn't an "order" for CFG cycles unless > we want to iterate). For loop BB order we explicitely handle > the latch, maybe just using a different order than > RPO order, with special-casing the latch, makes more sense? But is there any order that would help? Sure, we could say just tweak the rpo/bb_rpo arrays and put there latch bbs before any other bbs from the same loop, but that would only help us in calling find_assert_locations_1 on the latch before other bbs in the loop. But that call does nothing, and we'd need to special case handling of the live for the latch block anyway, just not in live_on_edge, but in find_assert_locations instead. Is that what what you are looking for? Because without that special casing of handling latch edges, you'd need to process the header rather than latch first, and propagate through DFS_BACK edges, but then you couldn't propagate through other edges, nor in this testcase would be the latch live computed when processing the header which has there the condition. Jakub
Re: [PATCH] Improve VRP assert creation for loops
On Thu, 7 Nov 2013, Jakub Jelinek wrote: > On Thu, Nov 07, 2013 at 09:48:46AM +0100, Richard Biener wrote: > > > --- gcc/tree-vrp.c.jj 2013-11-06 08:48:01.0 +0100 > > > +++ gcc/tree-vrp.c2013-11-06 09:32:19.205104029 +0100 > > > @@ -92,6 +92,42 @@ static sbitmap *live; > > > static bool > > > live_on_edge (edge e, tree name) > > > { > > > + if (live[e->dest->index] == NULL) > > > +{ > > > > I'd rather have live[] computed "correctly" than the following > > which I think is a hack ... as you say the issue is ordering > > (or rather that there isn't an "order" for CFG cycles unless > > we want to iterate). For loop BB order we explicitely handle > > the latch, maybe just using a different order than > > RPO order, with special-casing the latch, makes more sense? > > But is there any order that would help? Sure, we could say just tweak > the rpo/bb_rpo arrays and put there latch bbs before any other bbs from the > same loop, but that would only help us in calling find_assert_locations_1 > on the latch before other bbs in the loop. But that call does nothing, > and we'd need to special case handling of the live for the latch block > anyway, just not in live_on_edge, but in find_assert_locations instead. > > Is that what what you are looking for? > > Because without that special casing of handling latch edges, you'd need to > process the header rather than latch first, and propagate through DFS_BACK > edges, but then you couldn't propagate through other edges, nor in this > testcase would be the latch live computed when processing the header which > has there the condition. I'm looking for adjusting of live compute - either by adjusting the algorithm or by special casing the latches. Like for example with the following (untested, needs cleanup - the PHI processing can be factored out from find_assert_locations_1 and re-used): @@ -5904,6 +5898,27 @@ find_assert_locations (void) for (i = 0; i < rpo_cnt; ++i) bb_rpo[rpo[i]] = i; + /* Pre-seed loop latch liveness from loop header PHI nodes. Due to + the order we compute liveness and insert asserts we otherwise + fail to insert asserts into the loop latch. */ + loop_p loop; + loop_iterator li; + FOR_EACH_LOOP (li, loop, 0) +{ + i = loop->latch->index; + live[i] = sbitmap_alloc (num_ssa_names); + bitmap_clear (live[i]); + for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header); + !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple phi = gsi_stmt (gsi); + if (virtual_operand_p (gimple_phi_result (phi))) + continue; + for (unsigned j = 0; j < gimple_phi_num_args (phi); ++j) + if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME) + bitmap_set_bit (live[i], SSA_NAME_VERSION (gimple_phi_arg_def (phi, j))); + } +}
Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c
I was a bit too hasty in committing this patch, thinking the review was done. I have reverted it yesterday night. Let's try again after a night of sleep. Bernd Edlinger writes: > I still do not see how this is supposed to work: > > If the previous invocation of get_line already had read some > characters of the following line(s), how is that information > recovered? It's lost, you are right. I felt under my radar that the file position indicator adjustment that should be done in get_line after it finds a line delimiter felt through the cracks in the transformation made from the getline of glibc. Hopefully in this version, the file position indicator should be properly set when get_line finishes. So am proposing two patches that I think should be committed in pair. The first one is the one we've been discussing lately; it's amended to make get_line remember update the file position indicator. The second one is to factorize the two read_line (from gcov.c and input.c) and make gcov.c just use one from input.c. Nathan Sidwell writes: > How is input.o linked into gcov? I didn't changed the way input.o is linked into gcov. I just launched the debugger on some of the regression tests of gcov in the suite, to see that the read_line from input.c was being picked up. But now that you are asking, here is what I see from gcc/Makefile.in: input.o is part of OBJS-libcommon: OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o $(OBJS-libcommon) ends up in libcommon.a which itself is part of LIBS (and LIBDEPS) And the gcov executable is linked with LIBS as per: GCOV_OBJS = gcov.o gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@ > The former's a compiler proper source file. Furthermore, if it is > linked in, doesn't it drag in a whole bunch of the compiler itself > into gcov? >From the above, what I can say is that input.o was already linked with gcov. But I think it's minimal enough to only drag libcpp and the diagnostic subsystem. > I think that means read_line needs to be broken out into a separate > source file -- or at least one that's already common between gcov > and cc1 (etc). Hopefully this concern should be addressed by the above. > Changelog doesn't match diff. (function names wrong, no mention of gcov) Woops, sorry about that. The second patch attached is an update that should address that. And with the changes in the first patch, this new factorization patch is smaller. Jakub Jelinek writes: > On Thu, Nov 07, 2013 at 07:17:19AM +, Nathan Sidwell wrote: >> On 11/06/13 21:25, Dodji Seketeli wrote: >> >It appeared that gcov.c and input.c both have a static function named >> >read_line. I guess we ought to keep just one. >> >> As a general rule, that's good. > > Generally yes, but IMHO not in this case, both because of the problems you > are mentioning, because it already now needs to do quite different things > and because I think in the future the version for caret diagnostics should > do much more complicated things (caching of the lines for faster further > caret diagnostics). Furthermore, ftell/fseek aren't something that should > be used unless strictly necessary. OK, let's say I am sending these patches here just for the record so that it doesn't get lost :-) I'll devise another scheme right away to avoid the ftell/seek that you don't like. I agree that, as you proposed at the beginning of the initial thread, the caret diagnostics system should avoid opening/closing one file just to cycle over it to find one line, repeatedly. I just thought that would come as a later patch after we fix this PR specifically. Cheers. >From 75cea3f6883f192e9472fd7905f42d84d875e4e7 Mon Sep 17 00:00:00 2001 From: dodji Date: Wed, 6 Nov 2013 11:33:52 + Subject: [PATCH 1/2] PR preprocessor/58580 - preprocessor goes OOM with warning for zero literals gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter. * input.c (get_line): New static function definition. (read_line): Take an additional line_length output parameter to be set to the size of the line. Use the new get_line function do the actual line reading. (location_get_source_line): Take an additional output line_len parameter. Update the use of read_line to pass it the line_len parameter. * diagnostic.c (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0
Re: patch to fix PR58784 (ARM LRA crash)
Hi, I've tested LRA on ARM in several configuration and it occurs that only one regression is observed in Fortran for targets A15 and A9 in hard and soft fp amd Thumb mode enabled by default (ARMv5 is clean). The failing test case is gfortran.dg/realloc_on_assign_11.f90 in -O3 and the error seems to be of the same nature of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58784, in the sens that LRA ICEs in check_rtl on an insn which doesn't satisfy its constraints. Here is the insn : (insn 633 543 656 24 (parallel [ (set (mem/c:SI (plus:SI (reg/f:SI 907) (const_int 8 [0x8])) [4 A.54+8 S4 A64]) (smax:SI (reg:SI 562 [ D.5235 ]) (reg:SI 621 [ D.5235 ]))) (clobber (reg:CC 100 cc)) ]).../gfortran.dg/realloc_on_assign_11.f90:29 121 {*store_minmaxsi} (expr_list:REG_DEAD (reg:SI 621 [ D.5235 ]) (expr_list:REG_UNUSED (reg:CC 100 cc) (nil Thanks, Yvan On 31 October 2013 17:31, Yvan Roux wrote: > (this time in plain text !) > >> Does this mean we can now turn on LRA for ARM state by default and >> start looking at performance issues if any ? > > With the other patch for 58785 and a small one I've to post, we are even > about to turn LRA on by default completely, as the compiler now bootstrap > also in thumb and first testsuite are clean :-) > > I juste want ton pass a full validation before. > > Yvan > > >>> Ramana >>> >>> >>> >>> >>> >>> > 2013-10-30 Vladimir Makarov >>> > >>> > PR target/58784 >>> > * lra.c (check_rtl): Remove address check before LRA work. >>> > >>> > 2013-10-30 Vladimir Makarov >>> > >>> > PR target/58784 >>> > * gcc.target/arm/pr58784.c: New. >>> >
Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables
- warning_at (DECL_SOURCE_LOCATION (p), - OPT_Wunused_but_set_variable, - "variable %qD set but not used", p); + { +if (!TREE_THIS_VOLATILE (p)) + warning_at (DECL_SOURCE_LOCATION (p), +OPT_Wunused_but_set_variable, +"variable %qD set but not used", p); + } I'd prefer else if (DECL_CONTEXT (p) == current_function_decl && !TREE_THIS_VOLATILE (p)) which concises the logic and avoids indent change Thanks, Joey On Thu, Nov 7, 2013 at 3:37 PM, Bin.Cheng wrote: > On Thu, Nov 7, 2013 at 11:53 AM, Mingjie Xing wrote: >> 2013/11/6 Richard Biener : >>> You miss a testcase. >>> >>> Also why should the warning be omitted for unused automatic >>> volatile variables? They cannot be used in any way. >>> >>> Richard. >> >> Thanks. I've updated the patch with a test case. >> >> c/ChangeLog >> >> PR 57258 > > Should be: >PR c/57258 > > Thanks, > bin >> * c-decl.c (pop_scope): Don't emit unused variable warnings for >> accessed volatile variables. >> >> testsuite/ChangeLog >> >> * gcc.dg/Wunused-var-4.c: New test. >> >> Mingjie > > > > -- > Best Regards.
[rl78] Use canonical const_int for one_cmplqi2
This patch just changes a QImode (const_int 255) to (const_int -1), since the canonical form is to sign-extend. As things stand the pattern trips a new assert added on the wide-int branch. Tested by building rl78-elf before and after the patch and making sure that there were no changes in assembly code for gcc.c-torture, gcc.dg and g++dg. OK to install? Thanks, Richard gcc/ * config/rl78/rl78-expand.md (one_cmplqi2): Use (const_int -1) rather than (const_int 255). Index: gcc/config/rl78/rl78-expand.md === --- gcc/config/rl78/rl78-expand.md 2013-10-15 19:24:15.821884272 +0100 +++ gcc/config/rl78/rl78-expand.md 2013-11-07 09:41:50.754226143 + @@ -177,7 +177,7 @@ (define_expand "xorqi3" (define_expand "one_cmplqi2" [(set (match_operand:QI 0 "nonimmediate_operand") (xor:QI (match_operand:QI 1 "general_operand") - (const_int 255))) + (const_int -1))) ] "" "if (rl78_force_nonfar_2 (operands, gen_one_cmplqi2))
Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.
2013/11/6 Ilya Enkovich : > 2013/11/5 Jeff Law : >> On 11/04/13 06:27, Richard Biener wrote: >>> >>> On Thu, Oct 31, 2013 at 10:02 AM, Ilya Enkovich >>> wrote: Hi, Here is a patch which hadles the problem with optimization of BUILT_IN_CHKP_ARG_BND calls. Pointer Bounds Checker expects that argument of this call is a default SSA_NAME of the PARM_DECL whose bounds we want to get. The problem is in optimizations which may replace arg with it's copy or a known value. This patch suppress such modifications. >>> >>> >>> This doesn't seem like a good fix. I suppose you require the same on >>> RTL, that is, have the incoming arg reg coalesced with the use? >>> In that case better set SSA_NAME_OCCURS_IN_ABNORMAL_PHI. >> >> I agree. This seems like the wrong direction. >> >> While I don't like abusing SSA_NAME_OCCURS_IN_ABNORMAL_PHI in this manner, >> setting that flag should give them precisely the behaviour they want. That >> makes me think SSA_NAME_OCCURS_IN_ABNORMAL_PHI is poorly named. But >> addressing that is separate from this patch. >> >> Ilya, can you convert your code to set SSA_NAME_OCCURS_IN_ABNORMAL_PHI and >> drop this patch? > > OK. I'll try SSA_NAME_OCCURS_IN_ABNORMAL_PHI approach and see if it > causes any problems. Seems flag works fine. All my tests pass. No need to make additional changes in verifiers etc. Fix will be included into instrumentation pass patch. This patch is dropped. Thanks, Ilya > > Thanks, > Ilya > >> >> >> Jeff >>
Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)
On Wed, Nov 6, 2013 at 9:56 PM, Andrew MacLeod wrote: > On 11/06/2013 06:31 AM, Richard Biener wrote: >> >> On Wed, Nov 6, 2013 at 12:02 PM, Bernd Schmidt >> wrote: >>> >>> >>> Maybe we need to revisit it? As one of those who were not in favour of >>> the C++ move, can I ask you guys to step back for a moment and think >>> about - what do all of these changes buy us, exactly? Imagine the state >>> at the end, where everything is converted and supposedly the temporary >>> ugliness is gone, what have we gained over the code as it is now? >> >> as_a gains us less runtime checking and more static type checking >> which is good. >> >>> I still think all this effort is misdirected and distracts us from >>> making changes that improve gcc for its users. >> >> That I agree to. Instead of fixing the less than optimal separation / >> boundary >> between frontends and the middle-end, or fixing several other >> long-standing >> issues with GCC we spend a lot of time refactoring things to be C++. >> But that was kind of part of the decision (though I remember that we >> mainly wanted to convert containters and isolated stuff, not gimple >> or trees (I bet that'll be next)). >> >> Of course I don't see contributors of "changes that improve gcc for its >> users" >> now wasting their time with converting code to C++. That conversion >> may slow down those people, but only so much. It'll get more interesting >> with branch maintainance ... >> >> > What does it buy us? I won't comment on the merit of other changes, but I > will reiterate what I'm doing. > > 1 - I think the code restructuring to enable the changes I proposed is > quite worthwhile. Our includes are a mess and prototypes locations are all > over the map. I think this is a worthwhile cleanup that I probably wouldn't > be doing otherwise. True. > 2 - I really believe gimple needs a type system different from front end > trees, that is my primary motivation. I'm tired of jumping through hoops to > do anything slightly different, and I got fed up with it. With a separate > type system for gimple, we can rid ourselves of all the stuff that isn't > related to optimization and codegen... Could we do this with trees? > possibly, but if I'm going to go to the effort of converting the front end > tree types into a new or reduced-type subset, I might as well put that > effort into something that is more appropriate right from the start. I'm not sure. Especially the idea of wrapping everything in a I'm-not-a-tree-well-but-I-really-still-am scares the hell out of me. And I put that into the very same basket of things you now complain about. > 3 - Trees are very polymorphic and as a result, its difficult to tell where > types are being used. It's a feature. Now the type vs. object thing may be something to change (historically the reason they are entangled is that things like a vector and a list also were trees - which is a feature). > By replacing those uses with an actual type, we'll get > static type checking which is another improvement. There is only one way > to move the middle/back end off of types as trees that I can see. Go > through the source base and find all the places which use trees as types, > and replace them with an actual type. This can't be done all at once, its > too big a job... So I propose a plan which allows a new gimple type and > current trees to coexist. Once the conversion is complete we can easily > implement a type system that has the features and traits we want. Note that on 'gimple' there are no types ... the types are on the objects 'gimple' refers to and those are 'tree's. So you'd need to change the 'tree's to have TREE_TYPE not be a 'tree'. Or simply subclass 'tree' so you can distinguish a tree type from a tree non-type statically (a bit awkward with the current union setup). But maybe I'm confusing what you call 'type'. > 4 - I realized that if I am going through the entire code base to find all > the type locations, its very little extra work to also do the same for decls > and expressions. Then we'll have static type checking for those as well. > > 5 - When done we will be significantly closer to having the ability to > create a self contained IL that we can strictly define... That alone seems > a worthwhile goal to me. The IL then becomes appropriate for streaming at > any point in during compilation. This means any front end or tool can > easily generate or consume a gimple stream which opens up numerous > opportunities... That is pretty hard to enable today. Have fun with dealing with our legacy frontends ;) Or didn't you actually mean "any point" during compilation? ... oh, you talk about GIMPLE. So no plans for Frontend ASTs aka GENERIC for our legacy frontends? > Im going to do the initial bits on a branch for the first couple of months, > and during that time convert a number of files over to the new gimple > interface. Then we can look at it, measure it, comment on it, tweak it, > whate
Re: PATCH: PR target/59034: FAIL gcc.c-torture/compile/20031208-1.c
On Wed, Nov 6, 2013 at 11:42 PM, Uros Bizjak wrote: > On Thu, Nov 7, 2013 at 7:19 AM, H.J. Lu wrote: > >> We should use Pmode with stack_pointer_rtx. OK for trunk and 4.8 >> branch? > > OK everywhere. > I checked it into trunk and 4.8 branch. For 4.7, I only checked in the i386.md change since the testcases use the new -maddress-mode= in 4.8. Thanks. -- H.J.
Re: [PATCH, i386]: Fix PR 59021, new vzeroupper instructions generated with -mavx
On Wed, Nov 6, 2013 at 11:06 PM, Eric Botcazou wrote: >> Attached patch fixes PR 59021, where new vzeroupper instructions are >> generated for -mavx after Vlad's useless insn removal patch. >> >> The problem was, that we depent on (useless) moves to AVX256 function >> argument registers in front of the function call to switch the mode to >> DIRTY mode. This is not true anymore, so call_insn has to switch to >> DIRTY mode by itself. > > Thanks for fixing this! OTOH, looking a bit deeper, it looks that there is a problem in mode-switching infrastructure. If we have a BB without any mode requirements, but an insn that switched the mode via MODE_AFTER, we should not mark the block as transparent. Indeed, we even have a N.B. comment in mode-switching.c: /* Check for blocks without ANY mode requirements. N.B. because of MODE_AFTER, last_mode might still be different from no_mode. */ But, we do nothing w.r.t to transparency Attached incremental patch also fixes additional vzeroupper problem. Calls with AVX argument register in fact do not have any mode requirements, so we don't need to fake MODE_DIRTY requirement. I have added Joern to the discussion. Uros. Index: mode-switching.c === --- mode-switching.c(revision 204496) +++ mode-switching.c(working copy) @@ -577,6 +577,8 @@ optimize_mode_switching (void) { ptr = new_seginfo (no_mode, BB_END (bb), bb->index, live_now); add_seginfo (info + bb->index, ptr); + if (last_mode != no_mode) + bitmap_clear_bit (transp[bb->index], j); } } #if defined (MODE_ENTRY) && defined (MODE_EXIT) Index: config/i386/i386.c === --- config/i386/i386.c (revision 204496) +++ config/i386/i386.c (working copy) @@ -15708,7 +15708,7 @@ ix86_avx_u128_mode_needed (rtx insn) rtx arg = XEXP (XEXP (link, 0), 0); if (ix86_check_avx256_register (&arg, NULL)) - return AVX_U128_DIRTY; + return AVX_U128_ANY; } }
Re: Re-factor tree.h - Part 1
On Wed, Nov 6, 2013 at 6:10 PM, Diego Novillo wrote: > On Wed, Nov 6, 2013 at 2:04 AM, Marc Glisse wrote: >> On Tue, 5 Nov 2013, Diego Novillo wrote: >> >>> This is the first patch in a series of patches to cleanup tree.h to >>> reduce the exposure it has all over the compiler. >>> >>> In this patch, I'm moving functions that are used once into the files >>> that use them, and make them private to that file. These functions >>> were declared extern in tree.h and called from exactly one place. >> >> >> I am not a big fan of doing it so automatically. For instance >> widest_int_cst_value should imho remain next to int_cst_value since they >> mostly share the same implementation. Doing this also doesn't promote code >> reuse: if I am looking for a function that does some basic operation on >> trees, I won't only need to look in the file that is semantically relevant, > > Yeah, that was what I was trying to impose. Functions that > semantically make better sense in the place where they're called from. > I filtered functions that were used once but would not make much sense > to move (for instance, the init functions). > > How about this. Below is the list of functions that I took out of > tree.h. They are either not used or used only once, in which case I > moved them (and their private transitive closure) over to the file > that uses them. There are more, but these are the ones that I > originally considered to be too specific to the user file to be > globally exposed. Which ones would you folks keep global? > > I have marked some that we may decide to keep extern and one, which I > would not mind to not move at all. I also marked the functions I > removed and the ones that I just made private to their definition > file: > > Moved to builtins.c: > more_const_call_expr_args_p > expand_stack_restore > expand_stack_save Makes sense. > Moved to cfgexpand.c: > expand_main_function > stack_protect_prologue > expand_asm_stmt > expand_computed_goto > expand_goto > expand_return Likewise. > Moved to cgraphclones.c: > build_function_decl_skip_args Likewise. > Moved to explow.c: > tree_expr_size (maybe keep extern?) Weird enough function, no need to export it. > Moved to expr.c: > fields_length Likewise. > Moved to fold-const.c: > size_low_cst Same as int_cst_value just with lack of any checking? > Moved to gimple-fold.c: > truth_type_for (maybe keep extern?) Uses a langhook so should stay here. > Moved to symtab.c: > decl_assembler_name_hash > decl_assembler_name_equal Ok. > Moved to trans-mem.c: > is_tm_safe_or_pure Ok. > Moved to tree-eh.c: > in_array_bounds_p > range_in_array_bounds_p I'm not sure ... > Moved to tree-ssa-dom.c: > iterative_hash_exprs_commutative I'm not sure - it seems that iterative_hash_expr does exactly the same thing? So instead of moving things this should be refactored (make that function re-usable from iterative_hash_expr). > Moved to tree-ssa-math-opts.c: > widest_int_cst_value (maybe keep extern?) Keep it in tree.c > Moved to tree-vrp.c: > fold_unary_to_constant (?) No. > Moved to cp/call.c > ctor_to_vec No. > Moved to cp/decl.c: > supports_one_only > chain_member No. > Moved to java/class.c: > build_method_type No. > Removed (not used anywhere) > build_type_no_quals > omp_remove_redundant_declare_simd_attrs > fold_build3_initializer_loc > real_twop > print_vec_tree > list_equal_p > ssa_name_nonnegative_p > addr_expr_of_non_mem_decl_p > save_vtable_map_decl > > Made static (only used in its defining file) > stabilize_reference_1 > tree_expr_nonzero_p > tree_invalid_nonnegative_warnv_p > tree_expr_nonzero_warnv_p > fold_builtin_snprintf_chk > validate_arglist > simple_cst_list_equal > lookup_scoped_attribute_spec > get_attribute_namespace > fini_object_sizes Those are obvious. Richard. > > Thanks. Diego.
Re: [wide-int] Make integer_onep return false for signed 1-bit bitfields
On Wed, Nov 6, 2013 at 11:31 PM, Richard Sandiford wrote: > Ping. Ok. Thanks, Richard. > Richard Sandiford writes: >> As discussed on gcc@, integer_onep is supposed to return false for >> a nonzero 1-bit value if the type is signed. >> >> Tested on x86_64-linux-gnu and powerpc64-linux-gnu. OK to install? >> >> Thanks, >> Richard >> >> >> Index: gcc/tree.c >> === >> --- gcc/tree.c2013-10-29 19:19:27.623468618 + >> +++ gcc/tree.c2013-11-02 17:25:06.499657501 + >> @@ -2092,7 +2092,7 @@ integer_onep (const_tree expr) >>switch (TREE_CODE (expr)) >> { >> case INTEGER_CST: >> - return wi::eq_p (expr, 1); >> + return wi::eq_p (wi::to_widest (expr), 1); >> case COMPLEX_CST: >>return (integer_onep (TREE_REALPART (expr)) >> && integer_zerop (TREE_IMAGPART (expr)));
Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart
On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote: > > I decided to name the new file gimple-expr.[ch] instead of > gimple-decl This will eventually split into gimple-type.[ch], > gimple-decl.[ch], and gimple-expr.[ch]. Since we are adding *new* C++ files, can't we please name them *.cc for the implementation part, so at least create gimple-expr.h and gimple-expr.cc but not gimple-expr.c, please! There are some reasons to keep existing *.c files containing C++ code (IMHO the reasons are bad ones, and related to poor habits and to deficiencies in the version control system we have to use, but I really don't want to open that debate again). But for **NEW** files which are definitely in C++, I don't understand why they should be named .c files; this is confusing for all (and, for instance, when compiling them with Clang we are getting -IMHO rightly- some warnings about the file naming). If I remember well, there have been (in the discussion about naming C++ source files of GCC) a suggestion (and perhaps even a consensus), probably by Diego Novillo, to name *.cc our new files which are in C++. Having old C++ files named *.c is already a big frustration (most editors are by default configured to handle them as C, not as C++, files and most developers, notably newbies to GCC contributions, are expecting them to be C files not C++ ones). But IMHO having new source files inside GCC coded in C++ with a *.c file extension is non-sense; such files should have a .cc (or maybe .cpp or .cxx) extension, not a .c extension. Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: [PATCH]Fix computation of offset in ivopt
On Wed, Nov 6, 2013 at 6:06 PM, Richard Sandiford wrote: > Hi, > > "bin.cheng" writes: >> Index: gcc/tree-ssa-loop-ivopts.c >> === >> --- gcc/tree-ssa-loop-ivopts.c(revision 203267) >> +++ gcc/tree-ssa-loop-ivopts.c(working copy) >> @@ -2037,12 +2037,12 @@ find_interesting_uses (struct ivopts_data *data) >> >> static tree >> strip_offset_1 (tree expr, bool inside_addr, bool top_compref, >> - unsigned HOST_WIDE_INT *offset) >> + HOST_WIDE_INT *offset) >> { >>tree op0 = NULL_TREE, op1 = NULL_TREE, tmp, step; >>enum tree_code code; >>tree type, orig_type = TREE_TYPE (expr); >> - unsigned HOST_WIDE_INT off0, off1, st; >> + HOST_WIDE_INT off0, off1, st; >>tree orig_expr = expr; >> >>STRIP_NOPS (expr); >> @@ -2133,19 +2133,32 @@ strip_offset_1 (tree expr, bool inside_addr, bool >>break; >> >> case COMPONENT_REF: >> - if (!inside_addr) >> - return orig_expr; >> + { >> + tree field; >> >> - tmp = component_ref_field_offset (expr); >> - if (top_compref >> - && cst_and_fits_in_hwi (tmp)) >> - { >> - /* Strip the component reference completely. */ >> - op0 = TREE_OPERAND (expr, 0); >> - op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0); >> - *offset = off0 + int_cst_value (tmp); >> - return op0; >> - } >> + if (!inside_addr) >> + return orig_expr; >> + >> + tmp = component_ref_field_offset (expr); >> + field = TREE_OPERAND (expr, 1); >> + if (top_compref >> + && cst_and_fits_in_hwi (tmp) >> + && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field))) > > While comparing output for wide-int and mainline, I noticed that > this condition is now always false on x86_64, since DECL_FIELD_BIT_OFFSET > is a 128-bit bitsizetype and since cst_and_fits_in_hwi rejects constants > with precision greater than HWI: > > if (TREE_CODE (x) != INTEGER_CST) > return false; > > if (TYPE_PRECISION (TREE_TYPE (x)) > HOST_BITS_PER_WIDE_INT) > return false; I think that's simply overly restrictive on the side of cst_and_fits_in_hwi. I suppose this function is meant to complement int_cst_value (digging in history might be nice here). > Should this be host_integerp (DECL_FIELD_BIT_OFFSET (field), 0) instead? That's not the same as it rejects -1U. The function seems to ask if the value, if casted to unsigned fits in a HOST_WIDE_INT. So just drop the precision check from cst_and_fits_in_hwi. Richard. > >> + { >> + HOST_WIDE_INT boffset, abs_off; >> + >> + /* Strip the component reference completely. */ >> + op0 = TREE_OPERAND (expr, 0); >> + op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0); >> + boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field)); >> + abs_off = abs_hwi (boffset) / BITS_PER_UNIT; >> + if (boffset < 0) >> + abs_off = -abs_off; >> + >> + *offset = off0 + int_cst_value (tmp) + abs_off; >> + return op0; >> + } >> + } >>break; > > Thanks, > Richard
Re: [PATCH] Improve VRP assert creation for loops
On Thu, Nov 07, 2013 at 10:32:10AM +0100, Richard Biener wrote: > I'm looking for adjusting of live compute - either by adjusting the > algorithm or by special casing the latches. Like for example > with the following (untested, needs cleanup - the PHI processing > can be factored out from find_assert_locations_1 and re-used): > > @@ -5904,6 +5898,27 @@ find_assert_locations (void) >for (i = 0; i < rpo_cnt; ++i) > bb_rpo[rpo[i]] = i; > > + /* Pre-seed loop latch liveness from loop header PHI nodes. Due to > + the order we compute liveness and insert asserts we otherwise > + fail to insert asserts into the loop latch. */ > + loop_p loop; > + loop_iterator li; > + FOR_EACH_LOOP (li, loop, 0) > +{ > + i = loop->latch->index; > + live[i] = sbitmap_alloc (num_ssa_names); > + bitmap_clear (live[i]); > + for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header); > + !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple phi = gsi_stmt (gsi); > + if (virtual_operand_p (gimple_phi_result (phi))) > + continue; > + for (unsigned j = 0; j < gimple_phi_num_args (phi); ++j) > + if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME) > + bitmap_set_bit (live[i], SSA_NAME_VERSION > (gimple_phi_arg_def (phi, j))); > + } > +} LGTM, except that I think we should only care about phi args from the latch -> header edge, not others and IMHO it is memory waste to allocate sbitmap even when we wouldn't add anything. So, I'm going to rebootstrap/regtest the following (including new testcase Jeff asked for): 2013-11-07 Richard Biener Jakub Jelinek * tree-vrp.c (find_assert_locations): Pre-seed live bitmaps for loop latches from header PHI arguments from the latch edge. * gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options. * gcc.dg/unroll_2.c: Likewise. * gcc.dg/unroll_3.c: Likewise. * gcc.dg/unroll_4.c: Likewise. * gcc.dg/vrp90.c: New test. --- gcc/tree-vrp.c.jj 2013-11-06 16:58:04.675678567 +0100 +++ gcc/tree-vrp.c 2013-11-07 10:59:38.841283931 +0100 @@ -5906,6 +5906,34 @@ find_assert_locations (void) for (i = 0; i < rpo_cnt; ++i) bb_rpo[rpo[i]] = i; + /* Pre-seed loop latch liveness from loop header PHI nodes. Due to + the order we compute liveness and insert asserts we otherwise + fail to insert asserts into the loop latch. */ + loop_p loop; + loop_iterator li; + FOR_EACH_LOOP (li, loop, 0) +{ + i = loop->latch->index; + unsigned int j = find_edge (loop->latch, loop->header)->dest_idx; + for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header); + !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple phi = gsi_stmt (gsi); + if (virtual_operand_p (gimple_phi_result (phi))) + continue; + tree arg = gimple_phi_arg_def (phi, j); + if (TREE_CODE (arg) == SSA_NAME) + { + if (live[i] == NULL) + { + live[i] = sbitmap_alloc (num_ssa_names); + bitmap_clear (live[i]); + } + bitmap_set_bit (live[i], SSA_NAME_VERSION (arg)); + } + } +} + need_asserts = false; for (i = rpo_cnt - 1; i >= 0; --i) { --- gcc/testsuite/gcc.dg/unroll_1.c.jj 2013-09-09 11:32:36.0 +0200 +++ gcc/testsuite/gcc.dg/unroll_1.c 2013-11-06 17:10:52.900722932 +0100 @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops -fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll" } */ +/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll" } */ unsigned a[100], b[100]; inline void bar() --- gcc/testsuite/gcc.dg/unroll_2.c.jj 2013-08-30 14:38:39.0 +0200 +++ gcc/testsuite/gcc.dg/unroll_2.c 2013-11-06 17:10:30.751845504 +0100 @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo -fenable-rtl-loop2_unroll" } */ +/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo -fenable-rtl-loop2_unroll" } */ unsigned a[100], b[100]; inline void bar() --- gcc/testsuite/gcc.dg/unroll_3.c.jj 2013-08-30 14:38:39.0 +0200 +++ gcc/testsuite/gcc.dg/unroll_3.c 2013-11-06 17:10:38.864800338 +0100 @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll=foo" } */ +/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll=foo" } */ unsigned a[100], b[100]; inli
Re: [PATCH] Improve VRP assert creation for loops
On Thu, 7 Nov 2013, Jakub Jelinek wrote: > On Thu, Nov 07, 2013 at 10:32:10AM +0100, Richard Biener wrote: > > I'm looking for adjusting of live compute - either by adjusting the > > algorithm or by special casing the latches. Like for example > > with the following (untested, needs cleanup - the PHI processing > > can be factored out from find_assert_locations_1 and re-used): > > > > @@ -5904,6 +5898,27 @@ find_assert_locations (void) > >for (i = 0; i < rpo_cnt; ++i) > > bb_rpo[rpo[i]] = i; > > > > + /* Pre-seed loop latch liveness from loop header PHI nodes. Due to > > + the order we compute liveness and insert asserts we otherwise > > + fail to insert asserts into the loop latch. */ > > + loop_p loop; > > + loop_iterator li; > > + FOR_EACH_LOOP (li, loop, 0) > > +{ > > + i = loop->latch->index; > > + live[i] = sbitmap_alloc (num_ssa_names); > > + bitmap_clear (live[i]); > > + for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header); > > + !gsi_end_p (gsi); gsi_next (&gsi)) > > + { > > + gimple phi = gsi_stmt (gsi); > > + if (virtual_operand_p (gimple_phi_result (phi))) > > + continue; > > + for (unsigned j = 0; j < gimple_phi_num_args (phi); ++j) > > + if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME) > > + bitmap_set_bit (live[i], SSA_NAME_VERSION > > (gimple_phi_arg_def (phi, j))); > > + } > > +} > > LGTM, except that I think we should only care about phi args from the > latch -> header edge, not others and IMHO it is memory waste to allocate > sbitmap even when we wouldn't add anything. So, I'm going to > rebootstrap/regtest the following (including new testcase Jeff asked for): > > 2013-11-07 Richard Biener > Jakub Jelinek > > * tree-vrp.c (find_assert_locations): Pre-seed live bitmaps for loop > latches from header PHI arguments from the latch edge. > > * gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options. > * gcc.dg/unroll_2.c: Likewise. > * gcc.dg/unroll_3.c: Likewise. > * gcc.dg/unroll_4.c: Likewise. > * gcc.dg/vrp90.c: New test. > > --- gcc/tree-vrp.c.jj 2013-11-06 16:58:04.675678567 +0100 > +++ gcc/tree-vrp.c2013-11-07 10:59:38.841283931 +0100 > @@ -5906,6 +5906,34 @@ find_assert_locations (void) >for (i = 0; i < rpo_cnt; ++i) > bb_rpo[rpo[i]] = i; > > + /* Pre-seed loop latch liveness from loop header PHI nodes. Due to > + the order we compute liveness and insert asserts we otherwise > + fail to insert asserts into the loop latch. */ > + loop_p loop; > + loop_iterator li; > + FOR_EACH_LOOP (li, loop, 0) > +{ > + i = loop->latch->index; > + unsigned int j = find_edge (loop->latch, loop->header)->dest_idx; Actually there is only one edge from latch to header by definition, so single_succ_edge (loop->latch) would be better? Otherwise ok. Thanks, Richard. > + for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header); > +!gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple phi = gsi_stmt (gsi); > + if (virtual_operand_p (gimple_phi_result (phi))) > + continue; > + tree arg = gimple_phi_arg_def (phi, j); > + if (TREE_CODE (arg) == SSA_NAME) > + { > + if (live[i] == NULL) > + { > + live[i] = sbitmap_alloc (num_ssa_names); > + bitmap_clear (live[i]); > + } > + bitmap_set_bit (live[i], SSA_NAME_VERSION (arg)); > + } > + } > +} > + >need_asserts = false; >for (i = rpo_cnt - 1; i >= 0; --i) > { > --- gcc/testsuite/gcc.dg/unroll_1.c.jj2013-09-09 11:32:36.0 > +0200 > +++ gcc/testsuite/gcc.dg/unroll_1.c 2013-11-06 17:10:52.900722932 +0100 > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops > -fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll" } */ > +/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops > -fno-tree-vrp -fdisable-tree-cunroll -fdisable-tree-cunrolli > -fenable-rtl-loop2_unroll" } */ > > unsigned a[100], b[100]; > inline void bar() > --- gcc/testsuite/gcc.dg/unroll_2.c.jj2013-08-30 14:38:39.0 > +0200 > +++ gcc/testsuite/gcc.dg/unroll_2.c 2013-11-06 17:10:30.751845504 +0100 > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops > -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo > -fenable-rtl-loop2_unroll" } */ > +/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp > -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo > -fenable-rtl-loop2_unroll" } */ > > unsigned a[100], b[100]; > inline void bar() > --- gcc/testsuite/gcc.dg/unroll_3.c.jj2013-08-30 14:38:39.0 > +0200 > +++ gcc/testsuite/gcc.dg/unroll_3.c 2013-11-06 17:10:
[PATCH] Fix casting in get_ref_base_and_extent
As noticed by Richard. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2013-11-07 Richard Biener * tree-dfa.c (get_ref_base_and_extent): Fix casting. Index: gcc/tree-dfa.c === --- gcc/tree-dfa.c (revision 204458) +++ gcc/tree-dfa.c (working copy) @@ -569,7 +569,7 @@ get_ref_base_and_extent (tree exp, HOST_ && (!bit_offset.fits_shwi () || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1) || (bit_offset.to_shwi () + maxsize - == (signed) TREE_INT_CST_LOW + == (HOST_WIDE_INT) TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp)) maxsize = -1; @@ -606,7 +606,8 @@ get_ref_base_and_extent (tree exp, HOST_ && (!bit_offset.fits_shwi () || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1) || (bit_offset.to_shwi () + maxsize - == (signed) TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp)) + == (HOST_WIDE_INT) TREE_INT_CST_LOW + (TYPE_SIZE (TREE_TYPE (exp)) maxsize = -1; done:
[PATCH][2/2] Get TREE_OVERFLOW somewhat under control
This fixes CCP and DOM to not spread TREE_OVERFLOW eventually set from fold. I've chosen to change the lattice updating which is easiest. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2013-11-07 Richard Biener * tree-ssa-ccp.c (canonicalize_float_value): Rename to ... (canonicalize_value): ... this. Also handle stripping of TREE_OVERFLOW. (get_value, set_lattice_value, get_value_for_expr): Adjust. * gimple-fold.c (canonicalize_constructor_val): Strip TREE_OVERFLOW. * tree-ssa-threadedge.c (set_ssa_name_value): Likewise. Index: gcc/tree-ssa-ccp.c === --- gcc/tree-ssa-ccp.c (revision 204504) +++ gcc/tree-ssa-ccp.c (working copy) @@ -168,7 +168,7 @@ typedef struct prop_value_d prop_value_t static prop_value_t *const_val; static unsigned n_const_val; -static void canonicalize_float_value (prop_value_t *); +static void canonicalize_value (prop_value_t *); static bool ccp_fold_stmt (gimple_stmt_iterator *); /* Dump constant propagation value VAL to file OUTF prefixed by PREFIX. */ @@ -326,7 +326,7 @@ get_value (tree var) if (val->lattice_val == UNINITIALIZED) *val = get_default_value (var); - canonicalize_float_value (val); + canonicalize_value (val); return val; } @@ -378,17 +378,24 @@ set_value_varying (tree var) that HONOR_NANS is false, and we try to change the value of x to 0, causing an ICE. With HONOR_NANS being false, the real appearance of NaN would cause undefined behavior, though, so claiming that y (and x) - are UNDEFINED initially is correct. */ + are UNDEFINED initially is correct. + + For other constants, make sure to drop TREE_OVERFLOW. */ static void -canonicalize_float_value (prop_value_t *val) +canonicalize_value (prop_value_t *val) { enum machine_mode mode; tree type; REAL_VALUE_TYPE d; - if (val->lattice_val != CONSTANT - || TREE_CODE (val->value) != REAL_CST) + if (val->lattice_val != CONSTANT) +return; + + if (TREE_OVERFLOW_P (val->value)) +val->value = drop_tree_overflow (val->value); + + if (TREE_CODE (val->value) != REAL_CST) return; d = TREE_REAL_CST (val->value); @@ -454,7 +461,7 @@ set_lattice_value (tree var, prop_value_ /* We can deal with old UNINITIALIZED values just fine here. */ prop_value_t *old_val = &const_val[SSA_NAME_VERSION (var)]; - canonicalize_float_value (&new_val); + canonicalize_value (&new_val); /* We have to be careful to not go up the bitwise lattice represented by the mask. @@ -569,7 +576,7 @@ get_value_for_expr (tree expr, bool for_ val.lattice_val = CONSTANT; val.value = expr; val.mask = double_int_zero; - canonicalize_float_value (&val); + canonicalize_value (&val); } else if (TREE_CODE (expr) == ADDR_EXPR) val = get_value_from_alignment (expr); Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 204504) +++ gcc/gimple-fold.c (working copy) @@ -205,6 +205,8 @@ canonicalize_constructor_val (tree cval, cval = fold_convert (TREE_TYPE (orig_cval), cval); return cval; } + if (TREE_OVERFLOW_P (cval)) +return drop_tree_overflow (cval); return orig_cval; } Index: gcc/tree-ssa-threadedge.c === --- gcc/tree-ssa-threadedge.c (revision 204504) +++ gcc/tree-ssa-threadedge.c (working copy) @@ -58,6 +58,8 @@ set_ssa_name_value (tree name, tree valu { if (SSA_NAME_VERSION (name) >= ssa_name_values.length ()) ssa_name_values.safe_grow_cleared (SSA_NAME_VERSION (name) + 1); + if (value && TREE_OVERFLOW_P (value)) +value = drop_tree_overflow (value); ssa_name_values[SSA_NAME_VERSION (name)] = value; }
[PATCH, MPX, 2/X] Pointers Checker. Remove checker language hook.
Hi, Here is a patch to remove language hook used by Pointer Bounds Checker. To disable checker on non C languages option is moved to c.opt. Thanks, Ilya -- gcc/ 2013-11-06 Ilya Enkovich * common.opt (fcheck-pointer-bounds): Move to ... * c-family/c.opt: ... here. * langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): Remove. (LANG_HOOKS_INITIALIZER): Remove LANG_HOOKS_CHKP_SUPPORTED. * langhooks.h (lang_hooks): Remove chkp_supported field. * toplev.c (process_options): Remove chkp_supported check. diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index b862eb9..c082bac 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -850,6 +850,11 @@ fcanonical-system-headers C ObjC C++ ObjC++ Where shorter, use canonicalized paths to systems headers. +fcheck-pointer-bounds +C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds) +Add Pointer Bounds Checker instrumentation. fchkp-* flags are used to +control instrumentation. + fcilkplus C ObjC C++ ObjC++ LTO Report Var(flag_enable_cilkplus) Init(0) Enable Cilk Plus diff --git a/gcc/common.opt b/gcc/common.opt index 5c2f56e..deeb3f2 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -874,11 +874,6 @@ fbounds-check Common Report Var(flag_bounds_check) Generate code to check bounds before indexing arrays -fcheck-pointer-bounds -Common Report Var(flag_check_pointer_bounds) -Add Pointer Bounds Checker instrumentation. fchkp-* flags are used to -control instrumentation. Currently available for C, C++ and ObjC. - fbranch-count-reg Common Report Var(flag_branch_on_count_reg) Init(1) Optimization Replace add, compare, branch with branch on count register diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h index 0597036..b7be472 100644 --- a/gcc/langhooks-def.h +++ b/gcc/langhooks-def.h @@ -118,7 +118,6 @@ extern bool lhd_omp_mappable_type (tree); #define LANG_HOOKS_BLOCK_MAY_FALLTHRU hook_bool_const_tree_true #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP false #define LANG_HOOKS_DEEP_UNSHARING false -#define LANG_HOOKS_CHKP_SUPPORTED false /* Attribute hooks. */ #define LANG_HOOKS_ATTRIBUTE_TABLE NULL @@ -306,8 +305,7 @@ extern void lhd_end_section (void); LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \ LANG_HOOKS_BLOCK_MAY_FALLTHRU, \ LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \ - LANG_HOOKS_DEEP_UNSHARING, \ - LANG_HOOKS_CHKP_SUPPORTED \ + LANG_HOOKS_DEEP_UNSHARING \ } #endif /* GCC_LANG_HOOKS_DEF_H */ diff --git a/gcc/langhooks.h b/gcc/langhooks.h index 002d7eb..a83bf7b 100644 --- a/gcc/langhooks.h +++ b/gcc/langhooks.h @@ -472,9 +472,6 @@ struct lang_hooks gimplification. */ bool deep_unsharing; - /* True if this language allows pointers checker instrumentation. */ - bool chkp_supported; - /* Whenever you add entries here, make sure you adjust langhooks-def.h and langhooks.c accordingly. */ }; diff --git a/gcc/toplev.c b/gcc/toplev.c index 0eaf081..d3dac07 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1286,9 +1286,6 @@ process_options (void) { if (targetm.chkp_bound_mode () == VOIDmode) error ("-fcheck-pointers is not supported for this target"); - - if (!lang_hooks.chkp_supported) - flag_check_pointer_bounds = 0; } /* One region RA really helps to decrease the code size. */
[PATCH][AArch64][committed] Add comment on why plus_constant is not used in aarch64_legitimize_reload_address
Hi all, I've committed this patch as obvious. It adds a comment in aarch64_legitimize_reload_address explaining that we need the RTL structure to be preserved before push_reload and using plus_constant would fold that, causing ICEs. Thanks, Kyrill [gcc/] 2013-11-07 Kyrylo Tkachov * config/aarch64/aarch64.c (aarch64_legitimize_reload_address): Explain why plus_constant is not used.diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index ee2cb4c..3fe70c1 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -4007,7 +4007,11 @@ aarch64_legitimize_reload_address (rtx *x_p, cst = force_const_mem (xmode, cst); /* Reload high part into base reg, leaving the low part - in the mem instruction. */ + in the mem instruction. + Note that replacing this gen_rtx_PLUS with plus_constant is + wrong in this case because we rely on the + (plus (plus reg c1) c2) structure being preserved so that + XEXP (*p, 0) in push_reload below uses the correct term. */ x = gen_rtx_PLUS (xmode, gen_rtx_PLUS (xmode, XEXP (x, 0), cst), GEN_INT (low));
Re: [PATCH, MPX, 2/X] Pointers Checker [10/25] Calls copy and verification
Hi, Here is an updated patch version. Thanks, Ilya -- diff --git a/gcc/cgraph.c b/gcc/cgraph.c index 52d9ab0..9d7ae85 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -3030,40 +3030,54 @@ gimple_check_call_args (gimple stmt, tree fndecl, bool args_count_match) { for (i = 0, p = DECL_ARGUMENTS (fndecl); i < nargs; - i++, p = DECL_CHAIN (p)) + i++) { - tree arg; + tree arg = gimple_call_arg (stmt, i); + + /* Skip bound args inserted by Pointer Bounds Checker. */ + if (POINTER_BOUNDS_P (arg)) + continue; + /* We cannot distinguish a varargs function from the case of excess parameters, still deferring the inlining decision to the callee is possible. */ if (!p) break; - arg = gimple_call_arg (stmt, i); + if (p == error_mark_node || arg == error_mark_node || (!types_compatible_p (DECL_ARG_TYPE (p), TREE_TYPE (arg)) && !fold_convertible_p (DECL_ARG_TYPE (p), arg))) return false; + + p = DECL_CHAIN (p); } if (args_count_match && p) return false; } else if (parms) { - for (i = 0, p = parms; i < nargs; i++, p = TREE_CHAIN (p)) + for (i = 0, p = parms; i < nargs; i++) { - tree arg; + tree arg = gimple_call_arg (stmt, i); + + /* Skip bound args inserted by Pointer Bounds Checker. */ + if (POINTER_BOUNDS_P (arg)) + continue; + /* If this is a varargs function defer inlining decision to callee. */ if (!p) break; - arg = gimple_call_arg (stmt, i); + if (TREE_VALUE (p) == error_mark_node || arg == error_mark_node || TREE_CODE (TREE_VALUE (p)) == VOID_TYPE || (!types_compatible_p (TREE_VALUE (p), TREE_TYPE (arg)) && !fold_convertible_p (TREE_VALUE (p), arg))) return false; + + p = TREE_CHAIN (p); } } else diff --git a/gcc/gimple.c b/gcc/gimple.c index 20f6010..0fc704b 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -3043,20 +3043,27 @@ canonicalize_cond_expr_cond (tree t) } /* Build a GIMPLE_CALL identical to STMT but skipping the arguments in - the positions marked by the set ARGS_TO_SKIP. */ + the positions marked by the set ARGS_TO_SKIP. ARGS_TO_SKIP does not + take into account bounds arguments. Bounds passed for skipped args + are also skipped. */ gimple gimple_call_copy_skip_args (gimple stmt, bitmap args_to_skip) { - int i; + int i, bit; int nargs = gimple_call_num_args (stmt); vec vargs; vargs.create (nargs); gimple new_stmt; - for (i = 0; i < nargs; i++) -if (!bitmap_bit_p (args_to_skip, i)) - vargs.quick_push (gimple_call_arg (stmt, i)); + for (i = 0, bit = 0; i < nargs; i++, bit++) + if (POINTER_BOUNDS_P (gimple_call_arg (stmt, i))) + { + if (!bitmap_bit_p (args_to_skip, --bit)) + vargs.quick_push (gimple_call_arg (stmt, i)); + } + else if (!bitmap_bit_p (args_to_skip, bit)) + vargs.quick_push (gimple_call_arg (stmt, i)); if (gimple_call_internal_p (stmt)) new_stmt = gimple_build_call_internal_vec (gimple_call_internal_fn (stmt), @@ -3702,6 +3709,9 @@ validate_call (gimple stmt, tree fndecl) if (!targs) return true; tree arg = gimple_call_arg (stmt, i); + /* Skip bounds. */ + if (POINTER_BOUNDS_P (arg)) + continue; if (INTEGRAL_TYPE_P (TREE_TYPE (arg)) && INTEGRAL_TYPE_P (TREE_VALUE (targs))) ;
Re: [PATCH, MPX, 2/X] Pointers Checker. Remove checker language hook.
On Thu, Nov 7, 2013 at 12:46 PM, Ilya Enkovich wrote: > Hi, > > Here is a patch to remove language hook used by Pointer Bounds Checker. To > disable checker on non C languages option is moved to c.opt. Ok. Thanks, Richard. > Thanks, > Ilya > -- > gcc/ > > 2013-11-06 Ilya Enkovich > > * common.opt (fcheck-pointer-bounds): Move to ... > * c-family/c.opt: ... here. > * langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): Remove. > (LANG_HOOKS_INITIALIZER): Remove LANG_HOOKS_CHKP_SUPPORTED. > * langhooks.h (lang_hooks): Remove chkp_supported field. > * toplev.c (process_options): Remove chkp_supported check. > > > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index b862eb9..c082bac 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -850,6 +850,11 @@ fcanonical-system-headers > C ObjC C++ ObjC++ > Where shorter, use canonicalized paths to systems headers. > > +fcheck-pointer-bounds > +C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds) > +Add Pointer Bounds Checker instrumentation. fchkp-* flags are used to > +control instrumentation. > + > fcilkplus > C ObjC C++ ObjC++ LTO Report Var(flag_enable_cilkplus) Init(0) > Enable Cilk Plus > diff --git a/gcc/common.opt b/gcc/common.opt > index 5c2f56e..deeb3f2 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -874,11 +874,6 @@ fbounds-check > Common Report Var(flag_bounds_check) > Generate code to check bounds before indexing arrays > > -fcheck-pointer-bounds > -Common Report Var(flag_check_pointer_bounds) > -Add Pointer Bounds Checker instrumentation. fchkp-* flags are used to > -control instrumentation. Currently available for C, C++ and ObjC. > - > fbranch-count-reg > Common Report Var(flag_branch_on_count_reg) Init(1) Optimization > Replace add, compare, branch with branch on count register > diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h > index 0597036..b7be472 100644 > --- a/gcc/langhooks-def.h > +++ b/gcc/langhooks-def.h > @@ -118,7 +118,6 @@ extern bool lhd_omp_mappable_type (tree); > #define LANG_HOOKS_BLOCK_MAY_FALLTHRU hook_bool_const_tree_true > #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP false > #define LANG_HOOKS_DEEP_UNSHARING false > -#define LANG_HOOKS_CHKP_SUPPORTED false > > /* Attribute hooks. */ > #define LANG_HOOKS_ATTRIBUTE_TABLE NULL > @@ -306,8 +305,7 @@ extern void lhd_end_section (void); >LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \ >LANG_HOOKS_BLOCK_MAY_FALLTHRU, \ >LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \ > - LANG_HOOKS_DEEP_UNSHARING, \ > - LANG_HOOKS_CHKP_SUPPORTED \ > + LANG_HOOKS_DEEP_UNSHARING \ > } > > #endif /* GCC_LANG_HOOKS_DEF_H */ > diff --git a/gcc/langhooks.h b/gcc/langhooks.h > index 002d7eb..a83bf7b 100644 > --- a/gcc/langhooks.h > +++ b/gcc/langhooks.h > @@ -472,9 +472,6 @@ struct lang_hooks > gimplification. */ >bool deep_unsharing; > > - /* True if this language allows pointers checker instrumentation. */ > - bool chkp_supported; > - >/* Whenever you add entries here, make sure you adjust langhooks-def.h > and langhooks.c accordingly. */ > }; > diff --git a/gcc/toplev.c b/gcc/toplev.c > index 0eaf081..d3dac07 100644 > --- a/gcc/toplev.c > +++ b/gcc/toplev.c > @@ -1286,9 +1286,6 @@ process_options (void) > { >if (targetm.chkp_bound_mode () == VOIDmode) > error ("-fcheck-pointers is not supported for this target"); > - > - if (!lang_hooks.chkp_supported) > - flag_check_pointer_bounds = 0; > } > >/* One region RA really helps to decrease the code size. */
RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c
Hi, just some thoughts... fgetc will definitely be much faster than fread 16K and fseek back to the end of line position. Note: fgetc is already buffered and not too slow on average, but only if you do not fseek around. In that case the buffered file-stream data is lost. Well, reading 16K in one piece _could_ be faster than fgetc, but only if you read the data once and do not memcpy or memmove the buffer afterwards. But if you read 16K starting from 0, then read 16K starting from line 1, and so on, you will be really really slow. Did you ever profile your proposed patch? The most easy way to do something for the performance would be in location_get_source_line: If you remember the last file and position, and keep the file open, just in case, that the next time, either the same or a following line is requested. That would be a really simple optimization. Maybe that would be too simple? I do not know. > fseek (fp, prev_pos + cur_len, SEEK_SET); Furthermore this seek will position in the middle of the \r\n sequence on windows. I am furthermore not sure, how the gcov.c would react on a NUL-char in the input stream. Maybe in a comment, somewhere? Could it be possible that this reaches the gcov.c? In that case the same OOM would occur because the memory size is always duplicated after an incomplete line? And gcov.c uses printf ("..%s\n") to print a source listing right? If the line contains NUL-chars, the listing will only contain the part of the line until the NUL-char, which is not what we want? So in that case it would be better to replace NUL-chars in the line buffer with SPACEs, to get a printable C string? What do you think? Isn't a C string always more easy to handle, than this buffer with zero-chars? Regards Bernd.
[Commited] Fix for PR59029
Preapproved by Jakub in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59029 diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 4991a3a..535b670 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2013-11-07 Yury Gribov + Jakub Jelinek + + PR sanitizer/59029 + * gcc/asan.c (get_mem_refs_of_builtin_call): Allow + integer literals as addresses in instrumented builtins. + 2013-11-07 H.J. Lu PR target/59034 diff --git a/gcc/asan.c b/gcc/asan.c index fdca377..950d332 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -715,7 +715,7 @@ get_mem_refs_of_builtin_call (const gimple call, instrument_derefs. */ if (TREE_CODE (dest) == ADDR_EXPR) dest = TREE_OPERAND (dest, 0); - else if (TREE_CODE (dest) == SSA_NAME) + else if (TREE_CODE (dest) == SSA_NAME || TREE_CODE (dest) == INTEGER_CST) dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)), dest, build_int_cst (TREE_TYPE (dest), 0)); else diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 56d30a3..f6e735f 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2013-11-07 Yury Gribov + Jakub Jelinek + + PR sanitizer/59029 + * c-c++-common/asan/pr59029.c: New test. + 2013-11-07 H.J. Lu PR target/59034 diff --git a/gcc/testsuite/c-c++-common/asan/pr59029.c b/gcc/testsuite/c-c++-common/asan/pr59029.c new file mode 100644 index 000..a1319b2 --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/pr59029.c @@ -0,0 +1,7 @@ +/* { dg-do compile } */ + +int +foo () +{ + return __sync_fetch_and_add ((int *) 0, 1); +}
Re: [Commited] Fix for PR59029
On Thu, Nov 07, 2013 at 04:32:56PM +0400, Yury Gribov wrote: > + PR sanitizer/59029 > + * gcc/asan.c (get_mem_refs_of_builtin_call): Allow > + integer literals as addresses in instrumented builtins. > + The prefix gcc/ hasn't been dropped as it should. Marek
[ia64] [PR target/57491] internal compiler error: in ia64_split_tmode -O2, quadmath
Hello, I've looked into RTL after register allocation. Insn which lead to assert is: (insn 77 180 79 4 (set (reg/v:TF 44 loc12 [orig:359 T8 ] [359]) (mem:TF (post_modify:DI (reg/f:DI 14 r14 [435]) (plus:DI (reg/f:DI 14 r14 [435]) (reg:DI 45 loc13 [orig:342 D.3347 ] [342]))) [2 *R1_102+0 S16 A128])) 1.i:2370 131 {*movtf_intern\ al} When trying to split it into two DI loads here: ia64_split_tmode_move We check if source and destination RTX are register overlapped, and if so, set flag that says that load adress is dead after insns: if (GET_CODE (operands[1]) == MEM && reg_overlap_mentioned_p (operands[0], operands[1])) ... dead = true; This flags is asserted (to be unset) later in splitting of memory operand in ia64_split_tmode: case POST_MODIFY: gcc_assert (!reversed && !dead); I think that the insn (although op[0] and op1[1] are overlapped by regs) is still safe to split. And we might end up with something like this: (insn 272 0 0 (set (reg:DI 44 loc12) (mem:DI (post_inc:DI (reg/f:DI 14 r14 [435])) [2 *R1_102+0 S8 A128])) (insn 273 272 0 (set (reg:DI 45 loc13) (mem:DI (post_modify:DI (reg/f:DI 14 r14 [435]) (plus:DI (reg/f:DI 14 r14 [435]) (reg:DI 45 loc13 [orig:342 D.3347 ] [342]))) [2 *R1_102+8 S8 A64])) (set (reg/f:DI 14 r14 [435]) (plus:DI (reg/f:DI 14 r14 [435]) (const_int -8 [0xfff8]))) As far as I understand, second insn will update loc13 *after* ld is executed, so value of location in r14 will be correct. Patch in the bottom. ChangeLog entry: * config/ia64/ia64.c (ia64_split_tmode_move): Relax `dead' flag setting. Testcase builds w/o ICE with the patch. Bootstrapped. Could you pls take a look? -- Thanks, K --- gcc/config/ia64/ia64.c | 11 +-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c index a128b19..446ee59 100644 --- a/gcc/config/ia64/ia64.c +++ b/gcc/config/ia64/ia64.c @@ -1527,12 +1527,19 @@ ia64_split_tmode_move (rtx operands[]) && reg_overlap_mentioned_p (operands[0], operands[1])) { rtx base = XEXP (operands[1], 0); + rtx first_write = gen_rtx_REG (DImode, REGNO (operands[0])); while (GET_CODE (base) != REG) base = XEXP (base, 0); if (REGNO (base) == REGNO (operands[0])) - reversed = true; - dead = true; + { + reversed = true; + first_write = gen_rtx_REG (DImode, REGNO (operands[0]) + 1); + } + + if (GET_CODE (operands[0]) == REG + && reg_overlap_mentioned_p (first_write, operands[1])) + dead = true; } /* Another reason to do the moves in reversed order is if the first element of the target register pair is also the second element of
Re: [Commited] Fix for PR59029
Yup, Jakub already pointed this out. Should I fix the changelog with another commit? --- From: Marek Polacek Sent: Thursday, November 07, 2013 4:38PM To: Yury Gribov Cc: gcc-patches@gcc.gnu.org, Jakub Jelinek , reich...@gcc.gnu.org, Viacheslav Garbuzov , Evgeny Gavrin Subject: Re: [Commited] Fix for PR59029 On 11/07/2013 04:38 PM, Marek Polacek wrote: On Thu, Nov 07, 2013 at 04:32:56PM +0400, Yury Gribov wrote: + PR sanitizer/59029 + * gcc/asan.c (get_mem_refs_of_builtin_call): Allow + integer literals as addresses in instrumented builtins. + The prefix gcc/ hasn't been dropped as it should. Marek
Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c
On Thu, Nov 07, 2013 at 01:25:00PM +0100, Bernd Edlinger wrote: > just some thoughts... > > > fgetc will definitely be much faster than fread 16K and fseek back to the end > of line position. > > Note: fgetc is already buffered and not too slow on average, but only if you > do not > fseek around. In that case the buffered file-stream data is lost. fgetc (unless you are talking about fgetc_unlocked which is somewhat better) is actually very expensive. Of course, fseek is highly undersirable, but I've already said that. Jakub
Re: [Commited] Fix for PR59029
On Thu, Nov 07, 2013 at 04:48:10PM +0400, Yury Gribov wrote: > Yup, Jakub already pointed this out. Should I fix the changelog with > another commit? Yes, just fix up the ChangeLog and commit, for ChangeLog changes of course you don't add a new ChangeLog entry. Jakub
Re: [Commited] Fix for PR59029
>> Should I fix the changelog with another commit? > Yes, just fix up the ChangeLog and commit Fixed. -Y
RE: [PATCH, PR 57748] Check for out of bounds access, Part 2
Hi, On Fri, 25 Oct 2013 12:51:13, Richard Biener wrote: > > On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger > wrote: >> Hi, >> >>> Eh ... even >>> >>> register struct { int i; int a[0]; } asm ("ebx"); >>> >>> works. Also with int a[1] but not with a[2]. So just handling trailing >>> arrays makes this case regress as well. >>> >>> Now I'd call this use questionable ... but we've likely supported that >>> for decades and cannot change that now. >>> >>> Back to fixing everything in expand. >>> >>> Richard. >>> >> >> Ok, finally you asked for it. >> >> Here is my previous version of that patch again. >> >> I have now added a new value "EXPAND_REFERENCE" to the expand_modifier >> enumeration. It is almost like EXPAND_MEMORY but it does not interfere with >> constant values. >> >> I have done the same modification to VIEW_CONVERT_EXPR too, because >> this is a possible inner reference, itself. It is however inherently hard to >> test around this code. >> >> To understand this patch it is good to know what type of object the >> return value "tem" of get_inner_reference can be. >> >> From the program logic at get_inner_reference it is clear that the >> return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF, >> ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may >> be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably >> further restricted because exp is gimplified. >> >> Usually the result will be a MEM_REF or a SSA_NAME of the memory where >> the structure is to be found. >> >> When you look at where EXPAND_MEMORY is handled you see it is special-cased >> in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, >> ARRAY_RANGE_REF. >> >> At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the >> same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given: >> If it is an unaligned memory, we just return the unaligned reference. >> >> This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test >> case, >> because it is only a problem for STRICT_ALIGNMENT targets, and even there it >> will >> certainly be really hard to find test cases that exercise this code. >> >> In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF >> we do not have to touch the handling of the outer modifier. However we pass >> EXPAND_REFERENCE to the inner object, which should not be a recursive >> use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF. >> >> TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle >> EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL. >> >> >> Boot-strapped and regression-tested on x86_64-linux-gnu >> OK for trunk? > > You point to a weak spot in expansion - that it handles creating > the base MEM to offset with handled components by recursing > into the case that handles bare MEM_REFs. This makes the > bare MEM_REF handling code somewhat awkward (it's the > one to assign mem-attrs which are later adjusted for example). > > Maybe a better appropach than adding yet another expand > modifier would be to split out the "base MEM" expansion part > out of the bare MEM_REF handling code so we can call that > instead of recursing. > > In this light - instead of a new expand modifier don't you want > an actual flag that specifies we are coming from a call that > wants to expand a base? That is, allow EXPAND_SUM > but with the recursion flag set? > I think you are right. After some thought, I start to like that idea. This way we have at least much more flexibility, how to handle the inner references correctly, and if I change only the interfaces of expand_expr_real/real_1 that will not be used at too many places, either. > Finally I think the recursion into the VIEW_CONVERT_EXPR case > is only there because of the keep_aligning flag of get_inner_reference > which should be obsolete now that we properly handle its effects > in get_object_alignment. So you wouldn't need to adjust this path > if we finally can get rid of that. > I proposed a patch for that, but did not hear from you: http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02267.html nevertheless, even if the VIEW_CONVERT_EXPR may or may not be an inner reference, the code there should be kept in sync with the normal_inner_ref, and MEM_REF, IMHO. Attached, my updated patch for PR57748, Part 2. Boot-strapped and regression-tested on X86_64-pc-linux-gnu. Ok for trunk? Thanks Bernd. > What do others think? > > Thanks, > Richard. > >> Thanks >> Bernd.
RE: [PATCH, PR 57748] Check for out of bounds access, Part 2
oops - this time with attachments... > Hi, > > On Fri, 25 Oct 2013 12:51:13, Richard Biener wrote: >> >> On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger >> wrote: >>> Hi, >>> Eh ... even register struct { int i; int a[0]; } asm ("ebx"); works. Also with int a[1] but not with a[2]. So just handling trailing arrays makes this case regress as well. Now I'd call this use questionable ... but we've likely supported that for decades and cannot change that now. Back to fixing everything in expand. Richard. >>> >>> Ok, finally you asked for it. >>> >>> Here is my previous version of that patch again. >>> >>> I have now added a new value "EXPAND_REFERENCE" to the expand_modifier >>> enumeration. It is almost like EXPAND_MEMORY but it does not interfere with >>> constant values. >>> >>> I have done the same modification to VIEW_CONVERT_EXPR too, because >>> this is a possible inner reference, itself. It is however inherently hard to >>> test around this code. >>> >>> To understand this patch it is good to know what type of object the >>> return value "tem" of get_inner_reference can be. >>> >>> From the program logic at get_inner_reference it is clear that the >>> return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF, >>> ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may >>> be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably >>> further restricted because exp is gimplified. >>> >>> Usually the result will be a MEM_REF or a SSA_NAME of the memory where >>> the structure is to be found. >>> >>> When you look at where EXPAND_MEMORY is handled you see it is special-cased >>> in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, >>> ARRAY_RANGE_REF. >>> >>> At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the >>> same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given: >>> If it is an unaligned memory, we just return the unaligned reference. >>> >>> This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test >>> case, >>> because it is only a problem for STRICT_ALIGNMENT targets, and even there >>> it will >>> certainly be really hard to find test cases that exercise this code. >>> >>> In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF >>> we do not have to touch the handling of the outer modifier. However we pass >>> EXPAND_REFERENCE to the inner object, which should not be a recursive >>> use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF. >>> >>> TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle >>> EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL. >>> >>> >>> Boot-strapped and regression-tested on x86_64-linux-gnu >>> OK for trunk? >> >> You point to a weak spot in expansion - that it handles creating >> the base MEM to offset with handled components by recursing >> into the case that handles bare MEM_REFs. This makes the >> bare MEM_REF handling code somewhat awkward (it's the >> one to assign mem-attrs which are later adjusted for example). >> >> Maybe a better appropach than adding yet another expand >> modifier would be to split out the "base MEM" expansion part >> out of the bare MEM_REF handling code so we can call that >> instead of recursing. >> >> In this light - instead of a new expand modifier don't you want >> an actual flag that specifies we are coming from a call that >> wants to expand a base? That is, allow EXPAND_SUM >> but with the recursion flag set? >> > > I think you are right. After some thought, I start to like that idea. > > This way we have at least much more flexibility, how to handle the inner > references correctly, and if I change only the interfaces of > expand_expr_real/real_1 > that will not be used at too many places, either. > >> Finally I think the recursion into the VIEW_CONVERT_EXPR case >> is only there because of the keep_aligning flag of get_inner_reference >> which should be obsolete now that we properly handle its effects >> in get_object_alignment. So you wouldn't need to adjust this path >> if we finally can get rid of that. >> > > I proposed a patch for that, but did not hear from you: > http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02267.html > > nevertheless, even if the VIEW_CONVERT_EXPR may or may not be an inner > reference, the code there should be kept in sync with the normal_inner_ref, > and MEM_REF, IMHO. > > Attached, my updated patch for PR57748, Part 2. > > Boot-strapped and regression-tested on X86_64-pc-linux-gnu. > > Ok for trunk? > > > Thanks > Bernd. > >> What do others think? >> >> Thanks, >> Richard. >> >>> Thanks >>> Bernd.2013-11-07 Bernd Edlinger PR middle-end/57748 * expr.h (expand_expr_real, expand_expr_real_1): Add new parameter expand_reference. (expand_expr, expand_normal): Adjust. * expr.c (expand_expr_real, expand_expr_real_1): Add new parame
[C++ Patch/RFC] PR 58176 (ICE with nullptr at -O0 in output_constant)
Hi, at -O0 we ICE for this valid testcase in output_constant because NULLPTR_TYPE is not handled. Should I dig further because we never want nullptr to reach varasm? Because otherwise the patchlet works fine, I checked that expand_expr can cope perfectly well with NULLPTR_TYPE as TREE_TYPE of the INTEGER_CST (generally, we are already handling NULLPTR_TYPE in a number of places outside /cp) Thanks, Paolo. Index: testsuite/g++.dg/cpp0x/nullptr30.C === --- testsuite/g++.dg/cpp0x/nullptr30.C (revision 0) +++ testsuite/g++.dg/cpp0x/nullptr30.C (working copy) @@ -0,0 +1,40 @@ +// PR c++/58176 +// { dg-do compile { target c++11 } } + +// Nil +struct nil_ { constexpr nil_ () {} }; +constexpr nil_ nil; + +// Cons +template +struct cons_ { +using head_ = H; +using tail_ = T; + +H head; +T tail; + +constexpr cons_() {} +constexpr cons_(H const &h, T const &t) : head(h), tail(t) {} +}; +template +constexpr cons_ cons (H const &h, T const &t = nil) { return +cons_(h,t); } + +// List +template struct list_s; +template +struct list_s { +using type = cons_::type>; +}; +template <> +struct list_s<> { +using type = nil_; +}; +template +using list_ = typename list_s::type; +constexpr nil_ list () { return nil; } +template +constexpr list_ list (H h, T... t) { return cons(h, list(t...)); } + +constexpr auto l1 = list("monkey", 123.4, cons(1, 2), nullptr); Index: varasm.c === --- varasm.c(revision 204508) +++ varasm.c(working copy) @@ -4685,6 +4685,7 @@ output_constant (tree exp, unsigned HOST_WIDE_INT case OFFSET_TYPE: case FIXED_POINT_TYPE: case POINTER_BOUNDS_TYPE: +case NULLPTR_TYPE: if (! assemble_integer (expand_expr (exp, NULL_RTX, VOIDmode, EXPAND_INITIALIZER), MIN (size, thissize), align, 0))
Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart
On 11/07/2013 05:36 AM, Basile Starynkevitch wrote: On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote: I decided to name the new file gimple-expr.[ch] instead of gimple-decl This will eventually split into gimple-type.[ch], gimple-decl.[ch], and gimple-expr.[ch]. Since we are adding *new* C++ files, can't we please name them *.cc for the implementation part, so at least create gimple-expr.h and gimple-expr.cc but not gimple-expr.c, please! Assuming we put this into stage 1 next year, I would imagine we'd rename a number of things, including .cc, drop the tree-* from the tree-ssa-blah.[c]h files, etc etc. There have been a few things people have suggested renaming... I think if we do renaming, they ought to be all at one time to minimize the pain. At the moment, the new gimple-* files I'm creating are still C, so they are .c files... Andrew
Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables
On Thu, 7 Nov 2013, Mingjie Xing wrote: > 2013/11/6 Richard Biener : > > You miss a testcase. > > > > Also why should the warning be omitted for unused automatic > > volatile variables? They cannot be used in any way. > > > > Richard. > > Thanks. I've updated the patch with a test case. You don't seem to have answered Richard's question about why the change is desirable in the first place. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)
On Thu, 7 Nov 2013, Richard Biener wrote: > Well, I'm betting that you'll re-invent sth like 'tree' just don't > call it 'tree' ;) > You need to transparently refer to constants, SSA names and decls > (at least) as GIMPLE statement operands. You probably will make > a "gimple statement operand" base class. Well - that's a 'tree' ;) Uses of constants and decls in expressions are why it's not obvious that those get a static type different from expressions (or at least, they probably do need a common base class). But my model of how "tree" should ideally be split up has expressions (maybe including decls) as a completely separate static type from types, and identifiers as yet another static type, none of those inheriting from a common base class "tree". -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH, MPX, 2/X] Pointers Checker [6/25] Instrumentation pass
On Thu, 7 Nov 2013, Ilya Enkovich wrote: > Here is a new split. I still include rtl.h into tree-chkp.c for MEM_P > usage. Is it OK? I have no further comments on this patch. -- Joseph S. Myers jos...@codesourcery.com
Generally link to libgomp for -ftree-parallelize-loops=* (was: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.)
Hi! On Thu, 7 Nov 2013 09:15:45 +0100, Jakub Jelinek wrote: > On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote: > > --- gcc/config/arc/arc.h > > +++ gcc/config/arc/arc.h > > @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3. If not see > > %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} > > %{r}\ > > %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\ > > %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\ > > -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ > > +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ > > Perhaps add here |ftree-parallelize-loops=* too? [...] Like this (for trunk and gomp-4_0-branch)? gcc/ * config/arc/arc.h (LINK_COMMAND_SPEC): For -ftree-parallelize-loops=*, link to libgomp and its dependencies. * config/ia64/hpux.h (LIB_SPEC): Likewise. * config/pa/pa-hpux11.h (LIB_SPEC): Likewise. * config/pa/pa64-hpux.h (LIB_SPEC): Likewise. * gcc.c (GOMP_SELF_SPECS): Update comment about libgomp's dependencies. libgomp/ * libgomp.spec.in: Update comment about libgomp's dependencies. * configure.ac: Likewise. * configure: Regenerate. diff --git gcc/config/arc/arc.h gcc/config/arc/arc.h index 637f7b6..87908d4 100644 --- gcc/config/arc/arc.h +++ gcc/config/arc/arc.h @@ -174,7 +174,8 @@ along with GCC; see the file COPYING3. If not see %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}\ %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\ %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\ -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ +%{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\ +%(mflib)\ %{fprofile-arcs|fprofile-generate|coverage:-lgcov}\ %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\ %{!A:%{!nostdlib:%{!nostartfiles:%E}}} %{T*} }}" diff --git gcc/config/ia64/hpux.h gcc/config/ia64/hpux.h index ca592e4..517132c 100644 --- gcc/config/ia64/hpux.h +++ gcc/config/ia64/hpux.h @@ -92,7 +92,7 @@ do { \ #undef LIB_SPEC #define LIB_SPEC \ "%{!shared: \ - %{mt|pthread:%{fopenmp:-lrt} -lpthread} \ + %{mt|pthread:%{fopenmp|ftree-parallelize-loops=*:-lrt} -lpthread} \ %{p:%{!mlp64:-L/usr/lib/hpux32/libp} \ %{mlp64:-L/usr/lib/hpux64/libp} -lprof} \ %{pg:%{!mlp64:-L/usr/lib/hpux32/libp} \ diff --git gcc/config/pa/pa-hpux11.h gcc/config/pa/pa-hpux11.h index c217398..4da1b09 100644 --- gcc/config/pa/pa-hpux11.h +++ gcc/config/pa/pa-hpux11.h @@ -120,7 +120,8 @@ along with GCC; see the file COPYING3. If not see #undef LIB_SPEC #define LIB_SPEC \ "%{!shared:\ - %{fopenmp:%{static:-a archive_shared} -lrt %{static:-a archive}}\ + %{fopenmp|ftree-parallelize-loops=*:%{static:-a archive_shared} -lrt\ + %{static:-a archive}}\ %{mt|pthread:-lpthread} -lc\ %{static:%{!nolibdld:-a archive_shared -ldld -a archive -lc}\ %{!mt:%{!pthread:-a shared -lc -a archive\ diff --git gcc/config/pa/pa64-hpux.h gcc/config/pa/pa64-hpux.h index a3e21d3..9efc137 100644 --- gcc/config/pa/pa64-hpux.h +++ gcc/config/pa/pa64-hpux.h @@ -58,19 +58,22 @@ along with GCC; see the file COPYING3. If not see #if ((TARGET_DEFAULT | TARGET_CPU_DEFAULT) & MASK_GNU_LD) #define LIB_SPEC \ "%{!shared:\ - %{!p:%{!pg:%{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\ + %{!p:%{!pg:%{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\ + %{static:-a archive}}\ %{mt|pthread:-lpthread} -lc\ %{static:%{!nolibdld:-a shared -ldld -a archive -lc}\ %{!mt:%{!pthread:-a shared -lc -a archive}\ %{p:%{!pg:%{static:%{!mhp-ld:-a shared}%{mhp-ld:-a archive_shared}}\ -lprof %{static:-a archive}\ - %{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\ + %{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\ + %{static:-a archive}}\ %{mt|pthread:-lpthread} -lc\ %{static:%{!nolibdld:-a shared -ldld -a archive -lc}\ %{!mt:%{!pthread:-a shared -lc -a archive}\ %{pg:%{static:%{!mhp-ld:-a shared}%{mhp-ld:-a archive_shared}}\ -lgprof %{static:-a archive}\ - %{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\ + %{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\ + %{static:-a archive}}\ %{mt|pthread:-lpthread} -lc\ %{static:%{!nolibdld:-a shared -ldld -a archive -lc}\ %{!mt:%{!pthread:-a shared -lc -a archive}\ @@ -78,19 +81,22 @@ along with GCC; see the file COPYING3. If not see #else #define LIB_SPEC \ "%{!shared:\ - %{!p:%{!pg:%{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\ + %{!p:%{!pg:%{fopenmp|ftree-
PATCH: Don't set misaligned_prologue_used twice
I checked in this patch to remove redundant setting of misaligned_prologue_used. H.J. --- Index: ChangeLog === --- ChangeLog (revision 204511) +++ ChangeLog (working copy) @@ -1,3 +1,8 @@ +2013-11-07 H.J. Lu + + * config/i386/i386.c (ix86_expand_set_or_movmem): Don't set + misaligned_prologue_used when it has been set. + 2013-11-07 Yury Gribov Jakub Jelinek Index: config/i386/i386.c === --- config/i386/i386.c (revision 204511) +++ config/i386/i386.c (working copy) @@ -23796,7 +23796,6 @@ ix86_expand_set_or_movmem (rtx dst, rtx if (misaligned_prologue_used) { /* Misaligned move prologue handled small blocks by itself. */ - misaligned_prologue_used = true; expand_set_or_movmem_prologue_epilogue_by_misaligned_moves (dst, src, &destreg, &srcreg, move_mode, promoted_val, vec_promoted_val,
Re: [C++ Patch/RFC] PR 58176 (ICE with nullptr at -O0 in output_constant)
OK.
Re: [PATCH] remove nolonger needed {cgraph,varpool}_node_{,asm_}name () functions
On 11/07/2013 01:58 AM, tsaund...@mozilla.com wrote: > From: Trevor Saunders > > Hi, > > I admit its another c++ification, but the functions are useless, and if I'm > changing the name of the called function its just easier to drop the class > name > bit all together. > > Trev > >[...] > diff --git a/gcc/cgraph.h b/gcc/cgraph.h > index 4f93713..754d245 100644 > --- a/gcc/cgraph.h > +++ b/gcc/cgraph.h > @@ -44,6 +44,12 @@ class GTY((desc ("%h.type"), tag ("SYMTAB_SYMBOL"), >symtab_node > { > public: > + /* Return name. */ > + const char *name () const; > + > + /* Return asm name. */ > +const char * asm_name () const; > + Missing indentation? -- VZ
Re: [Patch, RTL] Eliminate redundant vec_select moves.
Richard Sandiford wrote: Tejas Belagod writes: Richard Sandiford wrote: Tejas Belagod writes: Richard Sandiford wrote: Tejas Belagod writes: + /* This is big-endian-safe because the elements are kept in target + memory order. So, for eg. PARALLEL element value of 2 is the same in + either endian-ness. */ + if (GET_CODE (src) == VEC_SELECT + && REG_P (XEXP (src, 0)) && REG_P (dst) + && REGNO (XEXP (src, 0)) == REGNO (dst)) +{ + rtx par = XEXP (src, 1); + int i; + + for (i = 0; i < XVECLEN (par, 0); i++) + { + rtx tem = XVECEXP (par, 0, i); + if (!CONST_INT_P (tem) || INTVAL (tem) != i) + return 0; + } + return 1; +} + I think for big endian it needs to be: INTVAL (tem) != i + base where base is something like: int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0); E.g. a big-endian V4HI looks like: msb lsb and shortening it to say V2HI only gives the low 32 bits: msb lsb But, in this case we want msb lsb It depends on whether the result occupies a full register or not. I was thinking of the case where it didn't, but I realise now you were thinking of the case where it did. And yeah, my suggestion doesn't cope with that... I was under the impression that the const vector parallel for vec_select represents the element indexes of the array in memory order. Therefore, in bigendian, msb lsb element a[0] a[1] a[2] a[3] and in littleendian msb lsb element a[3] a[2] a[1] a[0] Right. But if an N-bit value is stored in a register, it's assumed to occupy the lsb of the register and the N-1 bits above that. The other bits in the register are don't-care. E.g., leaving vectors to one side, if you have: (set (reg:HI N) (truncate:SI (reg:SI N))) on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this: msb lsb 01234567 4567 rather than: msb lsb 01234567 0123 for both endiannesses. The same principle applies to vectors. The lsb of the register is always assumed to be significant. So maybe the original patch was correct for partial-register and full-register results on little-endian, but only for full-register results on big-endian. Ah, ok! I think I get it. By eliminating set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0] using the check INTVAL (tem) != i, I'm essentially making subsequent operations use (reg:V2DI n) in DI mode which is a partial register result and this gives me the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial register mode, I have to make sure the correct elements coincide with the lsb in big-endian... Thanks for your input, I'll apply the offset correction for big-endian you suggested. I'll respin the patch. Thanks. Just for avoidance of doubt, the result might be a full or partial register, depending on the mode and target. I was trying to correct myself by agreeing that your original was right and mine was wrong for big-endian if the result is a full register. What I had in mind when I implemented this was a partial-reg result, but obviously it was wrong. Sorry, I'm going to take a step back - I'm trying to figure out what a full register result would look like. Looking at the pattern, set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0] the result is always a mode smaller than the src if the vec_select selects a subset of lanes. Plus if the src and dst are the same reg, because we're re-writing the src reg, wouldn't it always end up being a partial-reg? In this case, wouldn't (reg:DI n) always represent msb lsb Thanks, Tejas. I don't know if there are existing helper functions for this kind of thing. Richard
Re: [PATCH] Use get_range_info during number of iterations analysis
On Wed, Nov 06, 2013 at 03:21:04PM -0700, Jeff Law wrote: > >2013-11-06 Jakub Jelinek > > > > * tree-ssa-loop-niter.c: Include tree-ssanames.h. > > (determine_value_range): Add loop argument. Use get_range_info to > > improve range. > > (bound_difference): Adjust caller. > Clever, using the range from the PHI in the loop header. While it > may be too wide, it's likely going to be a lot better than nothing. > > Add a testcase and this is OK. Ok, thanks, here is what I've committed after another bootstrap/regtest. 2013-11-07 Jakub Jelinek * tree-ssa-loop-niter.c: Include tree-ssanames.h. (determine_value_range): Add loop argument. Use get_range_info to improve range. (bound_difference): Adjust caller. * gcc.dg/tree-ssa/loop-39.c: New test. --- gcc/tree-ssa-loop-niter.c.jj2013-10-24 10:19:21.0 +0200 +++ gcc/tree-ssa-loop-niter.c 2013-11-07 11:49:37.568083138 +0100 @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3. #include "diagnostic-core.h" #include "tree-inline.h" #include "tree-pass.h" +#include "tree-ssanames.h" #define SWAP(X, Y) do { affine_iv *tmp = (X); (X) = (Y); (Y) = tmp; } while (0) @@ -119,9 +120,12 @@ split_to_var_and_offset (tree expr, tree in TYPE to MIN and MAX. */ static void -determine_value_range (tree type, tree var, mpz_t off, +determine_value_range (struct loop *loop, tree type, tree var, mpz_t off, mpz_t min, mpz_t max) { + double_int minv, maxv; + enum value_range_type rtype = VR_VARYING; + /* If the expression is a constant, we know its value exactly. */ if (integer_zerop (var)) { @@ -130,9 +134,73 @@ determine_value_range (tree type, tree v return; } + get_type_static_bounds (type, min, max); + + /* See if we have some range info from VRP. */ + if (TREE_CODE (var) == SSA_NAME && INTEGRAL_TYPE_P (type)) +{ + edge e = loop_preheader_edge (loop); + gimple_stmt_iterator gsi; + + /* Either for VAR itself... */ + rtype = get_range_info (var, &minv, &maxv); + /* Or for PHI results in loop->header where VAR is used as +PHI argument from the loop preheader edge. */ + for (gsi = gsi_start_phis (loop->header); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple phi = gsi_stmt (gsi); + double_int minc, maxc; + if (PHI_ARG_DEF_FROM_EDGE (phi, e) == var + && (get_range_info (gimple_phi_result (phi), &minc, &maxc) + == VR_RANGE)) + { + if (rtype != VR_RANGE) + { + rtype = VR_RANGE; + minv = minc; + maxv = maxc; + } + else + { + minv = minv.max (minc, TYPE_UNSIGNED (type)); + maxv = maxv.min (maxc, TYPE_UNSIGNED (type)); + gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) <= 0); + } + } + } + if (rtype == VR_RANGE) + { + mpz_t minm, maxm; + gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) <= 0); + mpz_init (minm); + mpz_init (maxm); + mpz_set_double_int (minm, minv, TYPE_UNSIGNED (type)); + mpz_set_double_int (maxm, maxv, TYPE_UNSIGNED (type)); + mpz_add (minm, minm, off); + mpz_add (maxm, maxm, off); + /* If the computation may not wrap or off is zero, then this +is always fine. If off is negative and minv + off isn't +smaller than type's minimum, or off is positive and +maxv + off isn't bigger than type's maximum, use the more +precise range too. */ + if (nowrap_type_p (type) + || mpz_sgn (off) == 0 + || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0) + || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0)) + { + mpz_set (min, minm); + mpz_set (max, maxm); + mpz_clear (minm); + mpz_clear (maxm); + return; + } + mpz_clear (minm); + mpz_clear (maxm); + } +} + /* If the computation may wrap, we know nothing about the value, except for the range of the type. */ - get_type_static_bounds (type, min, max); if (!nowrap_type_p (type)) return; @@ -405,8 +473,8 @@ bound_difference (struct loop *loop, tre mpz_init (maxx); mpz_init (miny); mpz_init (maxy); - determine_value_range (type, varx, offx, minx, maxx); - determine_value_range (type, vary, offy, miny, maxy); + determine_value_range (loop, type, varx, offx, minx, maxx); + determine_value_range (loop, type, vary, offy, miny, maxy); mpz_sub (bnds->below, minx, maxy); mpz_sub (bnds->up, maxx, miny); --- gcc/testsuite/gcc.dg/tree-ssa/loop-39.c.jj 2013-11-07 11:52:01.871309995 +0100 +++ gcc/testsuite/gcc.dg/tree-ssa/loop-39.c 2013-11-07 11
Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)
On Thu, Nov 7, 2013 at 2:44 PM, Joseph S. Myers wrote: > On Thu, 7 Nov 2013, Richard Biener wrote: > >> Well, I'm betting that you'll re-invent sth like 'tree' just don't >> call it 'tree' ;) >> You need to transparently refer to constants, SSA names and decls >> (at least) as GIMPLE statement operands. You probably will make >> a "gimple statement operand" base class. Well - that's a 'tree' ;) > > Uses of constants and decls in expressions are why it's not obvious that > those get a static type different from expressions (or at least, they > probably do need a common base class). But my model of how "tree" should > ideally be split up has expressions (maybe including decls) as a > completely separate static type from types, and identifiers as yet another > static type, none of those inheriting from a common base class "tree". Sure. Also you make containers no longer trees (TREE_CHAIN anyone, TREE_VEC, TREE_LIST ...). Should be the first step I think (already started some time ago with using VECs for some pieces). Richard. > -- > Joseph S. Myers > jos...@codesourcery.com
RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c
On Thu, 7 Nov 2013 13:48:14, Jakub Jelinek wrote: > On Thu, Nov 07, 2013 at 01:25:00PM +0100, Bernd Edlinger wrote: >> just some thoughts... >> >> >> fgetc will definitely be much faster than fread 16K and fseek back to the >> end of line position. >> >> Note: fgetc is already buffered and not too slow on average, but only if you >> do not >> fseek around. In that case the buffered file-stream data is lost. > > fgetc (unless you are talking about fgetc_unlocked which is somewhat better) > is actually very expensive. Of course, fseek is highly undersirable, but > I've already said that. > > Jakub yes. That's true. I think, maybe this patch tries just to reach too many goals at the same time. What is the _real_ bug which should be fixed first, is that even a very short file with NUL-chars goes OOM in read_line. So stay with fgets, just detect that the file is binary and return in that case. The performance and duplicate code can still be improved by a follow-up patch. We do not really need any caret positions on a binary file, after all, we just should not ICE. Maybe that could be done by just detecting that we have a BINARY file, and stop printing any further lines. like this? --- input.c.jj 2013-01-10 21:38:27.0 +0100 +++ input.c 2013-11-07 15:33:53.804990865 +0100 @@ -106,12 +106,15 @@ read_line (FILE *file) { size_t len = strlen (string + pos); - if (string[pos + len - 1] == '\n') + if (len && string[pos + len - 1] == '\n') { string[pos + len - 1] = 0; return string; } pos += len; + /* test if binary file or last line incomplete? */ + if (pos < string_len-1) + return pos && feof (file) ? string : NULL; string = XRESIZEVEC (char, string, string_len * 2); string_len *= 2; } Bernd.
Re: Generally link to libgomp for -ftree-parallelize-loops=* (was: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.)
On Thu, Nov 07, 2013 at 02:58:31PM +0100, Thomas Schwinge wrote: > On Thu, 7 Nov 2013 09:15:45 +0100, Jakub Jelinek wrote: > > On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote: > > > --- gcc/config/arc/arc.h > > > +++ gcc/config/arc/arc.h > > > @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3. If not see > > > %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} > > > %{n} %{r}\ > > > %{s} %{t} %{u*} %{x} %{z} %{Z} > > > %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\ > > > %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\ > > > -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ > > > +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ > > > > Perhaps add here |ftree-parallelize-loops=* too? [...] > > Like this (for trunk and gomp-4_0-branch)? > > gcc/ > * config/arc/arc.h (LINK_COMMAND_SPEC): For > -ftree-parallelize-loops=*, link to libgomp and its dependencies. > * config/ia64/hpux.h (LIB_SPEC): Likewise. > * config/pa/pa-hpux11.h (LIB_SPEC): Likewise. > * config/pa/pa64-hpux.h (LIB_SPEC): Likewise. > * gcc.c (GOMP_SELF_SPECS): Update comment about libgomp's > dependencies. > libgomp/ > * libgomp.spec.in: Update comment about libgomp's dependencies. > * configure.ac: Likewise. > * configure: Regenerate. Yes, thanks. Jakub
Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)
On 11/07/2013 05:08 AM, Richard Biener wrote: 2 - I really believe gimple needs a type system different from front end trees, that is my primary motivation. I'm tired of jumping through hoops to do anything slightly different, and I got fed up with it. With a separate type system for gimple, we can rid ourselves of all the stuff that isn't related to optimization and codegen... Could we do this with trees? possibly, but if I'm going to go to the effort of converting the front end tree types into a new or reduced-type subset, I might as well put that effort into something that is more appropriate right from the start. I'm not sure. Especially the idea of wrapping everything in a I'm-not-a-tree-well-but-I-really-still-am scares the hell out of me. And I put that into the very same basket of things you now complain about. Well, this first phase is a transitional one... If it were possible, I'd change them all at once to not use a tree... but I don't think that is feasible. There is too much to find and change, and the source base is constantly changing. We can't just freeze the base for a year. so until I can get them all changes, they have to act like trees and interact like trees, so we make them trees under the covers. Short term pain and discomfort... If it helps, during the transition you can view it as replacing trees with oddly named tree subclasses which provide static type checking instead of run time GIMPLE_CHECK() calls :-) By replacing those uses with an actual type, we'll get static type checking which is another improvement. There is only one way to move the middle/back end off of types as trees that I can see. Go through the source base and find all the places which use trees as types, and replace them with an actual type. This can't be done all at once, its too big a job... So I propose a plan which allows a new gimple type and current trees to coexist. Once the conversion is complete we can easily implement a type system that has the features and traits we want. Note that on 'gimple' there are no types ... the types are on the objects 'gimple' refers to and those are 'tree's. So you'd need to change the 'tree's to have TREE_TYPE not be a 'tree'. Or simply subclass 'tree' so you can distinguish a tree type from a tree non-type statically (a bit awkward with the current union setup). But maybe I'm confusing what you call 'type'. Im talking about changing the type gimple statements refer to, and also what decls and expressions will refer to... making that the gimple_type. Today, thats a TREE_TYPE tree. During the transition period this gimple_type is indeed a tree under the covers.. so that all the existing code can work even if it hasnt been converted yet. Once everything is converted, then gimple_type has the tree removed from its declaration and we replace it with the bits we need/want... and it will no longer function as a tree.. gimplification will create these from the trees the front end created. 4 - I realized that if I am going through the entire code base to find all the type locations, its very little extra work to also do the same for decls and expressions. Then we'll have static type checking for those as well. 5 - When done we will be significantly closer to having the ability to create a self contained IL that we can strictly define... That alone seems a worthwhile goal to me. The IL then becomes appropriate for streaming at any point in during compilation. This means any front end or tool can easily generate or consume a gimple stream which opens up numerous opportunities... That is pretty hard to enable today. Have fun with dealing with our legacy frontends ;) Or didn't you actually mean "any point" during compilation? ... oh, you talk about GIMPLE. So no plans for Frontend ASTs aka GENERIC for our legacy frontends? I meant any point where we are in gimple... Thats the only point we really care about. Once control is passed to the middle end. No plans to touch GENERIC at all... in fact, GENERIC and 'tree' will become true synonyms when I'm done, because gimple wont have any trees in it anywhere. And rtl will also refer to gimple nodes. It just wont matter what warts are in trees. Im going to do the initial bits on a branch for the first couple of months, and during that time convert a number of files over to the new gimple interface. Then we can look at it, measure it, comment on it, tweak it, whatever. You can see exactly what we'll be getting into before we fully commit. Before we fully commit we want to see the underlying object changes. Because I still doubt this all will work out without seriously restructuring the entanglement of our tree-based frontends and the middle-end (say "fold"). That is, I still think you are starting this at the wrong end. You first need to solve the entanglement issue before you can seriously consider making any real change to the data structures GIMPLE relie
Re: [Patch, RTL] Eliminate redundant vec_select moves.
Tejas Belagod writes: > Richard Sandiford wrote: >> Tejas Belagod writes: >>> Richard Sandiford wrote: Tejas Belagod writes: > Richard Sandiford wrote: >> Tejas Belagod writes: >>> + /* This is big-endian-safe because the elements are kept in target >>> + memory order. So, for eg. PARALLEL element value of 2 is the >>> same in >>> + either endian-ness. */ >>> + if (GET_CODE (src) == VEC_SELECT >>> + && REG_P (XEXP (src, 0)) && REG_P (dst) >>> + && REGNO (XEXP (src, 0)) == REGNO (dst)) >>> +{ >>> + rtx par = XEXP (src, 1); >>> + int i; >>> + >>> + for (i = 0; i < XVECLEN (par, 0); i++) >>> + { >>> + rtx tem = XVECEXP (par, 0, i); >>> + if (!CONST_INT_P (tem) || INTVAL (tem) != i) >>> + return 0; >>> + } >>> + return 1; >>> +} >>> + >> I think for big endian it needs to be: >> >> INTVAL (tem) != i + base >> >> where base is something like: >> >> int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN >> (par, 0); >> >> E.g. a big-endian V4HI looks like: >> >> msb lsb >> >> >> and shortening it to say V2HI only gives the low 32 bits: >> >> msb lsb >> > But, in this case we want > > msb lsb > It depends on whether the result occupies a full register or not. I was thinking of the case where it didn't, but I realise now you were thinking of the case where it did. And yeah, my suggestion doesn't cope with that... > I was under the impression that the const vector parallel for vec_select > represents the element indexes of the array in memory order. > > Therefore, in bigendian, > > msb lsb > > element a[0] a[1] a[2] a[3] > > and in littleendian > > msb lsb > > element a[3] a[2] a[1] a[0] Right. But if an N-bit value is stored in a register, it's assumed to occupy the lsb of the register and the N-1 bits above that. The other bits in the register are don't-care. E.g., leaving vectors to one side, if you have: (set (reg:HI N) (truncate:SI (reg:SI N))) on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this: msb lsb 01234567 4567 rather than: msb lsb 01234567 0123 for both endiannesses. The same principle applies to vectors. The lsb of the register is always assumed to be significant. So maybe the original patch was correct for partial-register and full-register results on little-endian, but only for full-register results on big-endian. >>> Ah, ok! I think I get it. By eliminating >>> set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0] >>> >>> using the check INTVAL (tem) != i, I'm essentially making subsequent >>> operations >>> use (reg:V2DI n) in DI mode which is a partial register result and this >>> gives me >>> the wrong set of lanes in bigendian. So, if I want to use (reg n) in >>> partial >>> register mode, I have to make sure the correct elements coincide with >>> the lsb in >>> big-endian... >>> >>> Thanks for your input, I'll apply the offset correction for big-endian you >>> suggested. I'll respin the patch. >> >> Thanks. Just for avoidance of doubt, the result might be a full or >> partial register, depending on the mode and target. I was trying to >> correct myself by agreeing that your original was right and mine was >> wrong for big-endian if the result is a full register. >> > > What I had in mind when I implemented this was a partial-reg result, but > obviously it was wrong. > > Sorry, I'm going to take a step back - I'm trying to figure out what a full > register result would look like. Looking at the pattern, > >set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0] > > the result is always a mode smaller than the src if the vec_select selects a > subset of lanes. Plus if the src and dst are the same reg, because we're > re-writing the src reg, wouldn't it always end up being a partial-reg? > In this case, wouldn't (reg:DI n) always represent > > msb lsb > The problem is that one reg rtx can span several hard registers. E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32), but it might instead represent two 32-bit registers (nos. 32 and 33). Obviously the latter's not very likely for vectors this small, but more likely for larger ones (including on NEON IIRC). So if we had 2 32-bit registers being treated as a V4HI, it would be: <--32--><--33--> ms
Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))
Hi, On Wed, 6 Nov 2013, David Malcolm wrote: > > I don't like that. The empty classes are just useless, they imply a > > structure that isn't really there, some of the separate gimple codes > > are basically selectors of specific subtypes of a generic concept, > > without additional data or methods; creating a type for those is > > confusing. > > A type system does more than just express memory layouts: I know all that. But there's a point on the scale from a void* type system to a different-type-for-everything system where the advantages of a more complex type system are anulled by the disadvantages of a more complex type system. And I'm merely of the opinion that you crossed that point with this separation. You know we could have a typesystem where every different natural number is a type (and with C++ we _really_ could do that). Obviously that's also a reductio ad absurdum, but it demonstrates the point that there _are_ type systems with a too low complexity to expressiveness ratio. > * it permits the proof of absence of certain bugs > * it supports abstraction > * it documents the intent of the code Well, yes. You can document code with other things than types, you can add abstractions via different means, and certain bugs can indeed only be ruled out by different static types. In my mind types should abstract different larger concepts foremost (and not every little aspect of a concept). Only _then_ should they be used for other effects like the above, and only if the disadvantages don't overshadow that. Disadvantages being mostly simply more complexity: which types relate to others in which way; what are the methods you can do with that but not the other type; which are the members of the 100 types there are (and if a type system is too large to be used only by IDEs with code completion, then it's too complex); and of course also simply having to write more text when you can't use the same gimple variable for several statements. > To give an example from the patch, the proposed gimple_statement_switch > subclass of gimple_statement_with_ops adds no extra fields, but it adds > the invariant that (assuming the ptr is non-NULL), that code == > GIMPLE_SWITCH, or, more intuitively, "it's a switch statement". > > This means that if we have a gimple_statement_switch, that both a human > reading the code and the compiler can "know" at compile-time that the > code == GIMPLE_SWITCH. But a gimple switch is actually a special case of a gimple conditional jump (the normal GIMPLE_COND being another) which itself is a special case of a gimple control transfer (a thing causing one or multiple outgoing edges and ending a basic block). That enlightens that your hierarchy might not actually be the best split of types, it's merely a formal translation of gimple codes into types. And I question exactly that automatic introduction of types. There's not enough human thought put into the hierarchy to warrant the change. Types are a fundament and they shouldn't be introduced lightly. As gimple codes are simply flat it doesn't matter so much that the inherent structure of gimple statements isn't reflected (although there _is_ some structure by having the with_ops and with_mem_ops codes be together). But for a _type hierarchy_ I would pretty much require that real thought and insights are put into it. > Consider tree-switch-conversion.c: this contains various "gimple" > variables but they don't work on arbitrary gimple; they require code == > GIMPLE_SWITCH: > ... > We could be doing this directly in the type system, and using the > compiler to check it. Then please do that while introducing one new type. > > The fewer > > types the better IMO. > > There's a reductio ad absurdum of that argument: See above for my variant of that showing that it also goes into the other direction of more types. Ciao, Michael. P.S.: And do away with the "_statement" part of the type, gimple_switch ought to be enough.
Re: [PATCH] [libiberty] MAX_PATH problems with mingw gcc
On Wed, Nov 6, 2013 at 11:55 PM, Vladimir Simonov wrote: >> -Original Message- >> From: Ian Lance Taylor [mailto:i...@google.com] >> Sent: Wednesday, November 06, 2013 6:42 PM >> To: Joey Ye >> Cc: gcc-patches; d...@redhat.com; Vladimir Simonov >> Subject: Re: [PATCH] [libiberty] MAX_PATH problems with mingw gcc > > Jan, thank you for your attention. > > It looks to me that you missed that the patch changes current gcc policy for > work with > pathname separators on "hosts" supporting both back and forward slashes from > neutral(undefined) behavior to more defined - "From now on hosts/builds (in > terms of host-build-target) > supporting both back and forward slashes gcc tries to use forward slashes > both in filenames saved > in binaries for target and for internal work." > And this patch is just first, little step in this direction. In fact the > patch was published just > to show problems and start discussion about ways for their solution. > > Above may not satisfy you and other gcc developers/consumers. > As minimum I'm interested in Mingw people opinion. > > Arguments for new policy are simple - this policy should not affect > "native" builds but helps a lot in case when host/build supports both kinds > of separators but > target supports only forward slashes. > > Without explicit consensus on above I see no sense in the patch details > discussion. Personally, I'm fine with that proposed change in policy. As somebody who does not use Windows, it does not affect me. I agree that getting the opinion of the mingw maintainers makes sense. I commented on the details of the patch because Joey pinged me to submit it, and I am a libiberty maintainer. If you want to discuss the above proposal, then by all means discuss it. Ian
[PATCH, ARM, LRA] Fixed bootstrap failure in Thumb mode
Hi, this patch fixed an LRA cycling due to secondary reload (Thumb mode). Notice that this patch is a prerequisite to turn on LRA by default on ARM. Bootstrapped on a9 and a15 without any regression in the testsuite as LRA is off by default and with the regression reported in the thread bellow when LRA is on. http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html Thanks, Yvan 2013-11-07 Yvan Roux * config/arm/arm.h (THUMB_SECONDARY_INPUT_RELOAD_CLASS): Return NO_REGS for LRA. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 1781b75..d054906 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -1266,11 +1266,12 @@ enum reg_class /* Must leave BASE_REGS reloads alone */ #define THUMB_SECONDARY_INPUT_RELOAD_CLASS(CLASS, MODE, X) \ - ((CLASS) != LO_REGS && (CLASS) != BASE_REGS\ - ? ((true_regnum (X) == -1 ? LO_REGS \ - : (true_regnum (X) + HARD_REGNO_NREGS (0, MODE) > 8) ? LO_REGS \ - : NO_REGS)) \ - : NO_REGS) + (lra_in_progress ? NO_REGS \ + : ((CLASS) != LO_REGS && (CLASS) != BASE_REGS \ + ? ((true_regnum (X) == -1 ? LO_REGS\ + : (true_regnum (X) + HARD_REGNO_NREGS (0, MODE) > 8) ? LO_REGS \ + : NO_REGS)) \ + : NO_REGS)) #define THUMB_SECONDARY_OUTPUT_RELOAD_CLASS(CLASS, MODE, X) \ ((CLASS) != LO_REGS && (CLASS) != BASE_REGS\
[PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)
Here, forward propagation turned _6 = a.1_5 & 1; _7 = (_Bool) _6; into _7 = (_Bool) a.1_5; but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1. If a is an odd number, this is correct, but we don't know the value of a, so I disabled this optimization for "x & 1" cases. For e.g. "x & 255", this still works correctly. In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were miscompiling the following testcase... Regtested/bootstrapped on x86_64-linux, ok for trunk? 2013-11-07 Marek Polacek PR tree-optimization/59014 * tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't optimize x & 1 expressions here. testsuite/ * gcc.dg/torture/pr59014.c: New test. --- gcc/testsuite/gcc.dg/torture/pr59014.c.mp 2013-11-07 14:47:38.040941144 +0100 +++ gcc/testsuite/gcc.dg/torture/pr59014.c 2013-11-07 14:49:03.646271031 +0100 @@ -0,0 +1,26 @@ +/* PR tree-optimization/59014 */ +/* { dg-do run } */ + +int a = 2, b, c, d; + +int +foo () +{ + for (;; c++) +if ((b > 0) | (a & 1)) + ; +else + { + d = a; + return 0; + } +} + +int +main () +{ + foo (); + if (d != 2) +__builtin_abort (); + return 0; +} --- gcc/tree-ssa-forwprop.c.mp 2013-11-07 14:30:43.785733810 +0100 +++ gcc/tree-ssa-forwprop.c 2013-11-07 14:42:31.836765375 +0100 @@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple if (TREE_CODE (rhs_def_operand1) == SSA_NAME && ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1) && TREE_CODE (rhs_def_operand2) == INTEGER_CST + /* We can't turn + _6 = a & 1; + _7 = (_Bool) _6; +into + _7 = (_Bool) a; +as that's wrong if A is an even number. */ + && ! integer_onep (rhs_def_operand2) && operand_equal_p (rhs_def_operand2, build_low_bits_mask (TREE_TYPE (rhs_def_operand2), - TYPE_PRECISION (lhs_type)), + TYPE_PRECISION (lhs_type)), 0)) { /* This is an optimizable case. Replace the source operand Marek
Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification
On 11/07/13 00:38, Jakub Jelinek wrote: On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote: Hmmm, good point. I've moved update_stmt and company to the caller, and modified the caller to call regimplify_operands only for GIMPLE_RETURN which is the special case. Can't you (later) handle that without regimplification too? Sure! See attached patch. But as discussed on IRC, I wonder whether we can do without the following in the attached patch: + tree repl = make_ssa_name (TREE_TYPE (retval), NULL); + stmt = gimple_build_assign (repl, retval); + gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); + stmt = gimple_build_assign (ref, repl); ...and unconditionally do: + stmt = gimple_build_assign (ref, retval); ...since it seems all the GIMPLE_RETURNs I see can be replaced by ``retval_array[iter] = whatever'' without creating something non-gimple (thus avoiding an SSA variable). Either way, I'm ok. Let me know. Aldy gcc/ChangeLog.gomp * omp-low.c (ipa_simd_modify_function_body): Avoid regimplification of GIMPLE_RETURNs. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 14b8571..6039de9 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -11175,26 +11175,31 @@ ipa_simd_modify_function_body (struct cgraph_node *node, { case GIMPLE_RETURN: { - tree old_retval = gimple_return_retval (stmt); - if (old_retval) + tree retval = gimple_return_retval (stmt); + if (!retval) { - /* Replace `return foo' by `retval_array[iter] = foo'. */ - stmt = gimple_build_assign (build4 (ARRAY_REF, - TREE_TYPE (old_retval), - retval_array, iter, - NULL, NULL), - old_retval); - gsi_replace (&gsi, stmt, true); - gimple_regimplify_operands (stmt, &gsi); - info.modified = true; + gsi_remove (&gsi, true); + continue; } + + /* Replace `return foo' with `retval_array[iter] = foo'. */ + tree ref = build4 (ARRAY_REF, + TREE_TYPE (retval), + retval_array, iter, + NULL, NULL); + if (is_gimple_reg (retval)) + stmt = gimple_build_assign (ref, retval); else { - gsi_remove (&gsi, true); - continue; + tree repl = make_ssa_name (TREE_TYPE (retval), NULL); + stmt = gimple_build_assign (repl, retval); + gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); + stmt = gimple_build_assign (ref, repl); } - break; + gsi_replace (&gsi, stmt, true); + info.modified = true; } + break; default: walk_gimple_op (stmt, ipa_simd_modify_stmt_ops, &wi);
Re: [PATCH] Add check for aarch64 in vect_cmdline_needed
On 11/6/13, 5:06 PM, Joseph S. Myers wrote: > You should be testing aarch64*-*-* so as to match aarch64_be targets. Thank you for catching that. Please commit this new patch if is OK. I don't have SVN access. Thanks, Cesar 2013-11-06 Cesar Philippidis gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_cmdline_neeed): Add aarch64 to the list of targets that do not need command line argument to enable SIMD. Index: gcc/testsuite/lib/target-supports.exp === --- gcc/testsuite/lib/target-supports.exp (revision 423006) +++ gcc/testsuite/lib/target-supports.exp (working copy) @@ -1920,7 +1920,8 @@ proc check_effective_target_vect_cmdline_needed { || [check_effective_target_powerpc_altivec])) || ([istarget sparc*-*-*] && [check_effective_target_sparc_vis]) || [istarget spu-*-*] -|| ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } { +|| ([istarget arm*-*-*] && [check_effective_target_arm_neon]) +|| [istarget aarch64*-*-*] } { set et_vect_cmdline_needed_saved 0 } }
Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification
On Thu, Nov 07, 2013 at 08:17:13AM -0700, Aldy Hernandez wrote: > But as discussed on IRC, I wonder whether we can do without the > following in the attached patch: > > + tree repl = make_ssa_name (TREE_TYPE (retval), NULL); > + stmt = gimple_build_assign (repl, retval); > + gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); > + stmt = gimple_build_assign (ref, repl); > > ...and unconditionally do: > > + stmt = gimple_build_assign (ref, retval); This should be sufficient. > > ...since it seems all the GIMPLE_RETURNs I see can be replaced by > ``retval_array[iter] = whatever'' without creating something > non-gimple (thus avoiding an SSA variable). > > Either way, I'm ok. Let me know. Thanks. Jakub
Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)
On Thu, Nov 07, 2013 at 04:15:11PM +0100, Marek Polacek wrote: > Here, forward propagation turned > _6 = a.1_5 & 1; > _7 = (_Bool) _6; > into > _7 = (_Bool) a.1_5; > but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1. > If a is an odd number, this is correct, but we don't know the value > of a, so I disabled this optimization for "x & 1" cases. For e.g. "x & 255", > this still works correctly. > > In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were > miscompiling the following testcase... > > Regtested/bootstrapped on x86_64-linux, ok for trunk? Isn't the problem here not the constant 1, but the fact that TREE_CODE (lhs_type) == BOOLEAN_TYPE, with the special properties of boolean types? I mean, if lhs_type is say some normal INTEGER_TYPE with precision 1 (or say 3 and stmt & 3 etc.), then the cast alone should cover the required masking. > 2013-11-07 Marek Polacek > > PR tree-optimization/59014 > * tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't > optimize x & 1 expressions here. > testsuite/ > * gcc.dg/torture/pr59014.c: New test. > > --- gcc/tree-ssa-forwprop.c.mp2013-11-07 14:30:43.785733810 +0100 > +++ gcc/tree-ssa-forwprop.c 2013-11-07 14:42:31.836765375 +0100 > @@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple >if (TREE_CODE (rhs_def_operand1) == SSA_NAME > && ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1) > && TREE_CODE (rhs_def_operand2) == INTEGER_CST > + /* We can't turn > +_6 = a & 1; > +_7 = (_Bool) _6; > + into > +_7 = (_Bool) a; > + as that's wrong if A is an even number. */ > + && ! integer_onep (rhs_def_operand2) > && operand_equal_p (rhs_def_operand2, > build_low_bits_mask (TREE_TYPE (rhs_def_operand2), > -TYPE_PRECISION (lhs_type)), > +TYPE_PRECISION (lhs_type)), > 0)) > { > /* This is an optimizable case. Replace the source operand > > Marek Jakub
Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)
On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek wrote: > Here, forward propagation turned > _6 = a.1_5 & 1; > _7 = (_Bool) _6; > into > _7 = (_Bool) a.1_5; > but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1. ? (_Bool) 2 should truncate the value to zero. Richard. > If a is an odd number, this is correct, but we don't know the value > of a, so I disabled this optimization for "x & 1" cases. For e.g. "x & 255", > this still works correctly. > In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were > miscompiling the following testcase... > > Regtested/bootstrapped on x86_64-linux, ok for trunk? > > 2013-11-07 Marek Polacek > > PR tree-optimization/59014 > * tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't > optimize x & 1 expressions here. > testsuite/ > * gcc.dg/torture/pr59014.c: New test. > > --- gcc/testsuite/gcc.dg/torture/pr59014.c.mp 2013-11-07 14:47:38.040941144 > +0100 > +++ gcc/testsuite/gcc.dg/torture/pr59014.c 2013-11-07 14:49:03.646271031 > +0100 > @@ -0,0 +1,26 @@ > +/* PR tree-optimization/59014 */ > +/* { dg-do run } */ > + > +int a = 2, b, c, d; > + > +int > +foo () > +{ > + for (;; c++) > +if ((b > 0) | (a & 1)) > + ; > +else > + { > + d = a; > + return 0; > + } > +} > + > +int > +main () > +{ > + foo (); > + if (d != 2) > +__builtin_abort (); > + return 0; > +} > --- gcc/tree-ssa-forwprop.c.mp 2013-11-07 14:30:43.785733810 +0100 > +++ gcc/tree-ssa-forwprop.c 2013-11-07 14:42:31.836765375 +0100 > @@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple >if (TREE_CODE (rhs_def_operand1) == SSA_NAME > && ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1) > && TREE_CODE (rhs_def_operand2) == INTEGER_CST > + /* We can't turn > + _6 = a & 1; > + _7 = (_Bool) _6; > +into > + _7 = (_Bool) a; > +as that's wrong if A is an even number. */ > + && ! integer_onep (rhs_def_operand2) > && operand_equal_p (rhs_def_operand2, > build_low_bits_mask (TREE_TYPE > (rhs_def_operand2), > - TYPE_PRECISION (lhs_type)), > + TYPE_PRECISION (lhs_type)), >0)) > { > /* This is an optimizable case. Replace the source operand > > Marek
Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)
On Thu, Nov 07, 2013 at 04:39:03PM +0100, Richard Biener wrote: > On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek wrote: > > Here, forward propagation turned > > _6 = a.1_5 & 1; > > _7 = (_Bool) _6; > > into > > _7 = (_Bool) a.1_5; > > but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is > > 1. > > ? > > (_Bool) 2 > > should truncate the value to zero. You're right, say: _Bool foo (int x) { return (_Bool) (x & 1); } which due to the simplify_conversion_from_bitmask optimization is optimized into: _Bool _3; : _3 = (_Bool) x_1(D); return _3; is indeed expanded as: (set (reg:QI 90 [ D.1789 ]) (and:QI (reg:QI 90 [ D.1789 ]) (const_int 1 [0x1]))) Jakub
[PATCH] make has_gate and has_execute useless
From: Trevor Saunders Hi, This is the result of seeing what it would take to get rid of the has_gate and has_execute flags on pass_data. It turns out not much, but I wanted confirmation this part is ok before I go deal with all the places that initialize the fields. I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test suite regressions (ignoring the silk stuff because the full paths in its test names break my test script for now) Any reason this patch with the actual removal of the flags wouldn't be ok? Trev 2013-11-06 Trevor Saunders * pass_manager.h (pass_manager): Adjust. * passes.c (opt_pass::execute): Tell the pass manager it doesn't need to do anything for this pass. (pass_manager::register_dump_files_1): Don't uselessly deal with properties of passes. (pass_manager::register_dump_files): Adjust. (dump_one_pass): Just call pass->gate (). (execute_ipa_summary_passes): Likewise. (execute_one_pass): Don't check pass->has_execute flag. (ipa_write_summaries_2): Don't check pass->has_gate flag. (ipa_write_optimization_summaries_1): Likewise. (ipa_read_summaries_1): Likewise. (ipa_read_optimization_summaries_1): Likewise. (execute_ipa_stmt_fixups): Likewise. * tree-pass.h (pass_data): Rename has_gate to useless_has_gate and has_execute to useless_has_execute to be sure they're unused. (TODO_absolutely_nothing): New constant. diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h index 77d78eb..3bc0a99 100644 --- a/gcc/pass_manager.h +++ b/gcc/pass_manager.h @@ -93,7 +93,7 @@ public: private: void set_pass_for_id (int id, opt_pass *pass); - int register_dump_files_1 (struct opt_pass *pass, int properties); + void register_dump_files_1 (struct opt_pass *pass); void register_dump_files (struct opt_pass *pass, int properties); private: diff --git a/gcc/passes.c b/gcc/passes.c index 19e5869..3b28dc9 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -112,7 +112,7 @@ opt_pass::gate () unsigned int opt_pass::execute () { - return 0; + return TODO_absolutely_nothing; } opt_pass::opt_pass (const pass_data &data, context *ctxt) @@ -701,33 +701,21 @@ pass_manager::register_one_dump_file (struct opt_pass *pass) /* Recursive worker function for register_dump_files. */ -int +void pass_manager:: -register_dump_files_1 (struct opt_pass *pass, int properties) +register_dump_files_1 (struct opt_pass *pass) { do { - int new_properties = (properties | pass->properties_provided) - & ~pass->properties_destroyed; - if (pass->name && pass->name[0] != '*') register_one_dump_file (pass); if (pass->sub) -new_properties = register_dump_files_1 (pass->sub, new_properties); - - /* If we have a gate, combine the properties that we could have with - and without the pass being examined. */ - if (pass->has_gate) -properties &= new_properties; - else -properties = new_properties; +register_dump_files_1 (pass->sub); pass = pass->next; } while (pass); - - return properties; } /* Register the dump files for the pass_manager starting at PASS. @@ -739,7 +727,7 @@ pass_manager:: register_dump_files (struct opt_pass *pass,int properties) { pass->properties_required |= properties; - register_dump_files_1 (pass, properties); + register_dump_files_1 (pass); } struct pass_registry @@ -847,7 +835,7 @@ dump_one_pass (struct opt_pass *pass, int pass_indent) const char *pn; bool is_on, is_really_on; - is_on = pass->has_gate ? pass->gate () : true; + is_on = pass->gate (); is_really_on = override_gate_status (pass, current_function_decl, is_on); if (pass->static_pass_number <= 0) @@ -2002,7 +1990,7 @@ execute_ipa_summary_passes (struct ipa_opt_pass_d *ipa_pass) /* Execute all of the IPA_PASSes in the list. */ if (ipa_pass->type == IPA_PASS - && ((!pass->has_gate) || pass->gate ()) + && pass->gate () && ipa_pass->generate_summary) { pass_init_dump_file (pass); @@ -2153,7 +2141,7 @@ execute_one_pass (struct opt_pass *pass) /* Check whether gate check should be avoided. User controls the value of the gate through the parameter "gate_status". */ - gate_status = pass->has_gate ? pass->gate () : true; + gate_status = pass->gate (); gate_status = override_gate_status (pass, current_function_decl, gate_status); /* Override gate with plugin. */ @@ -2210,11 +2198,11 @@ execute_one_pass (struct opt_pass *pass) timevar_push (pass->tv_id); /* Do it! */ - if (pass->has_execute) -{ - todo_after = pass->execute (); - do_per_function (clear_last_verified, NULL); -} + todo_after = pass->execute (); + if (todo_after != TODO_absolutely_nothing) +do_per_function (clear_last_verified, NULL); + els
Re: RFC: simd enabled functions (omp declare simd / elementals)
Hi, On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote: > Hello gentlemen. I'm CCing all of you, because each of you can > provide valuable feedback to various parts of the compiler which I > touch. I have sprinkled love notes with your names throughout the > post :). sorry it took me so long, for various reasons I out of my control I've accumulated quite a backlog of email and tasks last week and it took me a lot of time to chew through it all. ... > [Martin] I have adapted the ipa_parm_adjustment infrastructure to > allow adding new arguments out of the blue like you mentioned was > missing in ipa-prop.h. I have also added support for creating > vectors of arguments. Could you take a look at my changes to > ipa-prop.[ch]? Sure, though I have only looked at ipa-* and tree-sra.c stuff. I do not have any real objections but would suggest a few amendments. I am glad this is becoming a useful infrastructure rather than just a part of IPA-SRA. Note that while ipa_combine_adjustments is not used from anywhere and thus probably buggy anyway, it should in theory be able to process new_param adjustments too. Can you please at least put a "not implemented" assert there? (The reason is that the plan still is to replace args_to_skip bitmaps in cgraphclones.c by adjustments one day and we do need to combine clones.) > > [Martin] I need to add new arguments in the case of inbranch clones, > which add an additional vector with a mask as the last argument: > For the following: > > #pragma omp declare simd simdlen(4) inbranch > int foo (int a) > { > return a + 1234; > } > > ...we would generate a clone with: > > vector(4) int > foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5) > > I thought it best to enhance ipa_modify_formal_parameters() and > associated machinery than to add the new argument ad-hoc. We > already have enough ways of doing tree and cgraph versioning in the > compiler ;-). > ... > gcc/ChangeLog.elementals > > * Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H. > * tree-vect-stmts.c (vectorizable_call): Allow > 3 arguments when > a SIMD clone may be available. > (vectorizable_function): Use SIMD clone if available. > * ipa-cp.c (determine_versionability): Nodes with SIMD clones are > not versionable. > * ggc.h (ggc_alloc_cleared_simd_clone_stat): New. > * cgraph.h (enum linear_stride_type): New. > (struct simd_clone_arg): New. > (struct simd_clone): New. > (struct cgraph_node): Add simdclone and simdclone_of fields. > (get_simd_clone): Protoize. > * cgraph.c (get_simd_clone): New. > Add `has_simd_clones' field. > * ipa-cp.c (determine_versionability): Disallow functions with > simd clones. (This looks like a repeated entry.) > * ipa-prop.h (ipa_sra_modify_function_body): Protoize. > (sra_ipa_modify_expr): Same. > (struct ipa_parm_adjustment): Add new_arg_prefix and new_param > fields. Document their use. > * ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand > new parameters and minor cleanups. > * omp-low.c: Add new pass_omp_simd_clone support code. > (make_pass_omp_simd_clone): New. > (pass_data_omp_simd_clone): Declare. > (class pass_omp_simd_clone): Declare. > (vecsize_mangle): New. > (ipa_omp_simd_clone): New. > (simd_clone_clauses_extract): New. > (simd_clone_compute_base_data_type): New. > (simd_clone_compute_vecsize_and_simdlen): New. > (simd_clone_create): New. > (simd_clone_adjust_return_type): New. > (simd_clone_adjust_return_types): New. > (simd_clone_adjust): New. > (simd_clone_init_simd_arrays): New. > (ipa_simd_modify_function_body): New. > (simd_clone_mangle): New. > (simd_clone_struct_alloc): New. > (simd_clone_struct_copy): New. > (class argno_map): New. > (argno_map::argno_map(tree)): New. > (argno_map::~argno_map): New. > (argno_map::operator []): New. > (argno_map::length): New. > (expand_simd_clones): New. > (create_tmp_simd_array): New. > * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New. > * tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document. > * tree-pass.h (make_pass_omp_simd_clone): New. > * passes.def (pass_omp_simd_clone): New. > * target.def: Define new hook prefix "TARGET_CILKPLUS_". > (default_vecsize_mangle): New. > (vecsize_for_mangle): New. > * doc/tm.texi.in: Add placeholder for > TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and > TARGET_CILKPLUS_VECSIZE_FOR_MANGLE. > * tree-sra.c (sra_ipa_modify_expr): Remove static modifier. > (ipa_sra_modify_function_body): Same. > * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define. > * doc/tm.texi: Regenerate. > * config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New. >
Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)
On Thu, Nov 07, 2013 at 04:39:03PM +0100, Richard Biener wrote: > On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek wrote: > > Here, forward propagation turned > > _6 = a.1_5 & 1; > > _7 = (_Bool) _6; > > into > > _7 = (_Bool) a.1_5; > > but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is > > 1. > > ? > > (_Bool) 2 > > should truncate the value to zero. Aha. I can see now, said the blind man. Anyway, I'll revisit this in stage 3; I now suspect the special handling of BIT_IOR_EXPR in register_edge_assert_for_1. Marek
Re: Implement C11 _Atomic
On Wed, Nov 6, 2013 at 12:21 AM, Joseph S. Myers wrote: > This patch, relative to trunk and based on work done on the C11-atomic > branch, adds support for C11 _Atomic. It is intended to include all > the required language support. > > It does not include the header; there's a version on the > branch, but it needs further review against the standard and test > coverage adding to the testsuite before I can propose it for mainline. > > Support for atomic types having bigger alignment than the > corresponding non-atomic types is limited: it includes the code to > increase the alignment of types whose size is exactly 1, 2, 4, 8 or 16 > to that of the corresponding integer type [*], but not anything for > target-specific alignment increases. There's code for target-specific > alignment on the branch (and I intend to merge trunk back to the > branch once this patch is on trunk, so it's easy to tell what the > changes still left on the branch are), should any target maintainers > wish to have such alignment. Note however that ideally libstdc++ > atomics would be ABI-compatible with C atomics, requiring them to get > the same alignment; the branch has an "atomic" attribute for that > purpose, but I think further work on the C++ front-end changes would > be needed for them to be ready for mainline. > > [*] The c-common.c code to resolve size-generic atomic built-in > functions to size-specific ones assumes that types are naturally > aligned (and so resolves to calls to the size-specific functions that > require natural alignment) without checking it. If users should be > able to use the size-generic functions on types with lesser alignment, > e.g. _Complex double (8-byte rather than 16-byte aligned), rather than > just on their _Atomic variants, this is of course a bug in the code > resolving those built-in functions. (The libatomic functions properly > check for alignment at runtime.) But it seems reasonable enough > anyway to make _Atomic _Complex double 16-byte aligned. > > Full use of _Atomic requires linking with libatomic, if you're doing > any atomic operations that aren't fully expanded inline; arguably > libatomic should be linked in by default with --as-needed (given that > previously ordinary C code didn't require the user to link any > libraries explicitly unless they directly called library functions), > but that's independent of this patch. Correct handling of atomic > compound assignment for floating-point types also requires built-in > support for floating-point environment manipulation operations roughly > equivalent to feholdexcept, feclearexcept and feupdateenv (built-ins > are used to avoid introducing libm dependencies, which generic C code > not actually calling libm library functions shouldn't require); this > patch includes such support for x86 [*], and I expect other > architectures to be simpler. If you don't have a libatomic port, the > execution tests for _Atomic simply won't be run; if you have libatomic > but not the floating-point support, the tests will be run but > c11-atomic-exec-5.c will fail (assuming the library features are > present for that test to run at all - it also requires pthreads). > > [*] This could be optimized if desired by passing the types in > question to the target hook so it can generate code for only one of > x87 and SSE in most cases, much like an optimization present in glibc > when it internally does feholdexcept / fesetenv / feupdateenv > sequences. > > _Atomic support is currently disabled for Objective-C and OpenMP. For > both (but mainly OpenMP), the relevant parser code needs checking to > determine where convert_lvalue_to_rvalue calls need inserting to > ensure that accesses to atomic variables involve atomic loads. For > Objective-C, there are also various special cases of compound > assignment that need special handling for atomics just as standard C > compound assignment is handled differently for atomics, as well as > some TYPE_MAIN_VARIANT calls to check for correctness for atomics; see > the comment on the relevant sorry () call for details. OpenMP should > also have TYPE_MAIN_VARIANT uses checked as well as a use of > TYPE_QUALS_NO_ADDR_SPACE for a diagnostic in > c_parser_omp_declare_reduction (where the diagnostic refers to a > particular list of qualifiers). > > Bootstrapped with no regressions on x86_64-unknown-linux-gnu. OK to > commit (the various non-front-end pieces)? > > gcc: > 2013-11-05 Andrew MacLeod > Joseph Myers > > * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC. > (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE, > TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE. > (struct tree_base): Add atomic_flag field. > * tree.h (TYPE_ATOMIC): New accessor macro. > (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC. > (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro. > (atomicQI_type_node, atomicHI_type_node, at
Clean up atomic tests
This patch cleans up various issues with the tests of atomics built-in functions and libatomic functions, in preparation for adapting those tests to add test coverage of stdatomic.h macros. The tests were missing a return type for main (C11 doesn't allow implicit int return types). Some tests were incrementing a variable in one part of an expression and using its value in another part without a sequence point in between. And the libatomic tests shouldn't have been restricted to targets with hardware sync_* support, or adding command-line options to make such support available, since they are built with -fno-inline-atomics anyway to ensure the library functionality is what's tested, and the library is meant to cover all types regardless of what hardware support may be available. Tested for x86_64-unknown-linux-gnu. OK to commit? gcc/testsuite: 2013-11-07 Joseph Myers * gcc.dg/atomic-compare-exchange-1.c, gcc.dg/atomic-compare-exchange-2.c, gcc.dg/atomic-compare-exchange-3.c, gcc.dg/atomic-compare-exchange-4.c, gcc.dg/atomic-compare-exchange-5.c, gcc.dg/atomic-exchange-1.c, gcc.dg/atomic-exchange-2.c, gcc.dg/atomic-exchange-3.c, gcc.dg/atomic-exchange-4.c, gcc.dg/atomic-exchange-5.c, gcc.dg/atomic-fence.c, gcc.dg/atomic-flag.c, gcc.dg/atomic-generic.c, gcc.dg/atomic-invalid.c, gcc.dg/atomic-load-1.c, gcc.dg/atomic-load-2.c, gcc.dg/atomic-load-3.c, gcc.dg/atomic-load-4.c, gcc.dg/atomic-load-5.c, gcc.dg/atomic-lockfree.c, gcc.dg/atomic-noinline.c, gcc.dg/atomic-op-1.c, gcc.dg/atomic-op-2.c, gcc.dg/atomic-op-3.c, gcc.dg/atomic-op-4.c, gcc.dg/atomic-op-5.c, gcc.dg/atomic-param.c, gcc.dg/atomic-store-1.c, gcc.dg/atomic-store-2.c, gcc.dg/atomic-store-3.c, gcc.dg/atomic-store-4.c, gcc.dg/atomic-store-5.c: Declare main as returning int. * gcc.dg/atomic-exchange-1.c, gcc.dg/atomic-exchange-2.c, gcc.dg/atomic-exchange-3.c, gcc.dg/atomic-exchange-4.c, gcc.dg/atomic-exchange-5.c: Separate increments of count from expression using value of count. libatomic: 2013-11-07 Joseph Myers * testsuite/libatomic.c/atomic-compare-exchange-1.c, testsuite/libatomic.c/atomic-compare-exchange-2.c, testsuite/libatomic.c/atomic-compare-exchange-3.c, testsuite/libatomic.c/atomic-compare-exchange-4.c, testsuite/libatomic.c/atomic-compare-exchange-5.c, testsuite/libatomic.c/atomic-exchange-1.c, testsuite/libatomic.c/atomic-exchange-2.c, testsuite/libatomic.c/atomic-exchange-3.c, testsuite/libatomic.c/atomic-exchange-4.c, testsuite/libatomic.c/atomic-exchange-5.c, testsuite/libatomic.c/atomic-generic.c, testsuite/libatomic.c/atomic-load-1.c, testsuite/libatomic.c/atomic-load-2.c, testsuite/libatomic.c/atomic-load-3.c, testsuite/libatomic.c/atomic-load-4.c, testsuite/libatomic.c/atomic-load-5.c, testsuite/libatomic.c/atomic-op-1.c, testsuite/libatomic.c/atomic-op-2.c, testsuite/libatomic.c/atomic-op-3.c, testsuite/libatomic.c/atomic-op-4.c, testsuite/libatomic.c/atomic-op-5.c, testsuite/libatomic.c/atomic-store-1.c, testsuite/libatomic.c/atomic-store-2.c, testsuite/libatomic.c/atomic-store-3.c, testsuite/libatomic.c/atomic-store-4.c, testsuite/libatomic.c/atomic-store-5.c: Declare main as returning int. Do not require built-in sync support or add target-specific options. * testsuite/libatomic.c/atomic-exchange-1.c, testsuite/libatomic.c/atomic-exchange-2.c, testsuite/libatomic.c/atomic-exchange-3.c, testsuite/libatomic.c/atomic-exchange-4.c, testsuite/libatomic.c/atomic-exchange-5.c: Separate increments of count from expression using value of count. Index: gcc/testsuite/gcc.dg/atomic-op-2.c === --- gcc/testsuite/gcc.dg/atomic-op-2.c (revision 204508) +++ gcc/testsuite/gcc.dg/atomic-op-2.c (working copy) @@ -528,6 +528,7 @@ abort (); } +int main () { test_fetch_add (); Index: gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c === --- gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c(revision 204508) +++ gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c(working copy) @@ -16,6 +16,7 @@ #define STRONG 0 #define WEAK 1 +int main () { Index: gcc/testsuite/gcc.dg/atomic-exchange-3.c === --- gcc/testsuite/gcc.dg/atomic-exchange-3.c(revision 204508) +++ gcc/testsuite/gcc.dg/atomic-exchange-3.c(working copy) @@ -9,25 +9,31 @@ int v, count, ret; +int main () { v = 0; count = 0; - if (__atomic_exchange_n (&v, count + 1, __ATOMIC_RELAXED) != count++) + if (_
Re: Implement C11 _Atomic
On Tue, Nov 05, 2013 at 11:21:56PM +, Joseph S. Myers wrote: > 2013-11-05 Andrew MacLeod > Joseph Myers > > * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC. > (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE, > TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE. > (struct tree_base): Add atomic_flag field. > * tree.h (TYPE_ATOMIC): New accessor macro. > (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC. > (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro. > (atomicQI_type_node, atomicHI_type_node, atomicSI_type_node) > (atomicDI_type_node, atomicTI_type_node): New macros for type > nodes. > * tree.c (set_type_quals): Set TYPE_ATOMIC. > (find_atomic_core_type): New function. > (build_qualified_type): Adjust alignment for qualified types. > (build_atomic_base): New function > (build_common_tree_nodes): Build atomicQI_type_node, > atomicHI_type_node, atomicSI_type_node, atomicDI_type_node and > atomicTI_type_node. > * print-tree.c (print_node): Print atomic qualifier. > * tree-pretty-print.c (dump_generic_node): Print atomic type > attribute. > * target.def (atomic_assign_expand_fenv): New hook. > * doc/tm.texi.in (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New @hook. > * doc/tm.texi: Regenerate. > * targhooks.c (default_atomic_assign_expand_fenv): New function. > * targhooks.h (default_atomic_assign_expand_fenv): Declare. > * sync-builtins.def (__atomic_feraiseexcept): New built-in > function. > 2013-11-05 Joseph Myers > > * lib/target-supports.exp > (check_effective_target_fenv_exceptions): New function. > * lib/atomic-dg.exp, gcc.dg/atomic/atomic.exp: New files. > * gcc.dg/atomic/c11-atomic-exec-1.c, > gcc.dg/atomic/c11-atomic-exec-2.c, > gcc.dg/atomic/c11-atomic-exec-3.c, > gcc.dg/atomic/c11-atomic-exec-4.c, > gcc.dg/atomic/c11-atomic-exec-5.c, gcc.dg/c11-atomic-1.c, > gcc.dg/c11-atomic-2.c, gcc.dg/c11-atomic-3.c, > gcc.dg/c90-atomic-1.c, gcc.dg/c99-atomic-1.c: New tests. > > libatomic: > 2013-11-05 Joseph Myers > > * fenv.c: New file. > * libatomic.map (LIBATOMIC_1.1): New symbol version. Include > __atomic_feraiseexcept. > * configure.ac (libtool_VERSION): Change to 2:0:1. > (fenv.h): Test for header. > * Makefile.am (libatomic_la_SOURCES): Add fenv.c. > * Makefile.in, auto-config.h.in, configure: Regenerate. The middle-end, libatomic and testsuite changes are ok. Jakub
Re: Cilk Library
On 11/07/13 06:11, Thomas Schwinge wrote: Hi! On Wed, 9 Oct 2013 18:32:11 +, "Iyer, Balaji V" wrote: * Makefile.def: Add libcilkrts to target_modules. Make libcilkrts depend on libstdc++ and libgcc. [...] * Makefile.in: Added libcilkrts related fields to support building it. How did you modify the latter file? I noticed it is no longer in sync with the former: if I regenerate it (»autogen Makefile.def«), then there are differences. This is easily fixed, of course, but as everyone now has been using the "out-of-sync" Makefile.in, I wonder whether the regeneration qualifies as obvious to commit, or rather something in Makefile.def needs to be changed to make it match the Makefile.in as it has been checked in in r204173? * Makefile.in: Regenerate. I think we should consider regeneration as an obvious change. If that breaks something, then it's the original author who introduced the change without a corresponding regenerate that needs to fix their code. Just to be explicit, this is fine for the trunk. Jeff
Re: Re-factor tree.h - Part 1
On 11/07/13 10:24, Diego Novillo wrote: On Thu, Nov 7, 2013 at 5:29 AM, Richard Biener wrote: Moved to fold-const.c: size_low_cst Same as int_cst_value just with lack of any checking? It may be. It's used by the POINTER_EXPR_PLUS handler. I will try to replace it in a later patch. Moved to gimple-fold.c: truth_type_for (maybe keep extern?) Uses a langhook so should stay here. Done. Moved to tree-ssa-dom.c: iterative_hash_exprs_commutative I'm not sure - it seems that iterative_hash_expr does exactly the same thing? So instead of moving things this should be refactored (make that function re-usable from iterative_hash_expr). Future patch? Making it static will at least prevent diverging uses of the similar functions. I think future patch. In theory we ought to just make iterative_hash_exprs_commutative just go away, but there's some restructuring of the DOM code that would need to happen first. jeff
Committed: arc_ifcvt: take commutativity into account
tmp Description: Binary data
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Thu, 7 Nov 2013, Rong Xu wrote: > Thanks Joseph for these detailed comments/suggestions. > The fixed patch is attached to this email. > The only thing left out is the Texinfo manual. Do you mean this tool > should have its it's own texi file in gcc/doc? Its own texi file, probably included as a chapter by gcc.texi (like gcov.texi is). -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] make has_gate and has_execute useless
On 11/07/13 09:00, tsaund...@mozilla.com wrote: From: Trevor Saunders Hi, This is the result of seeing what it would take to get rid of the has_gate and has_execute flags on pass_data. It turns out not much, but I wanted confirmation this part is ok before I go deal with all the places that initialize the fields. Most definitely, the has_gate/has_execute stuff is definitely a wart. I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test suite regressions (ignoring the silk stuff because the full paths in its test names break my test script for now) Any reason this patch with the actual removal of the flags wouldn't be ok? I don't see anything glaringly wrong. Any reason why the register_dump_files bits were mucking with properties. That just seems wrong. + todo_after = pass->execute (); + if (todo_after != TODO_absolutely_nothing) +do_per_function (clear_last_verified, NULL); + else +todo_after &= ~TODO_absolutely_nothing; Isn't the last assignment there just todo_after = 0; ? Jeff
[wide-int, committed] Fix nds32 build
Get the new port building on wide-int. Seemed pretty obvious so I went ahead and installed it. Thanks, Richard Index: gcc/config/nds32/nds32.c === --- gcc/config/nds32/nds32.c2013-11-05 13:06:54.744239262 + +++ gcc/config/nds32/nds32.c2013-11-07 10:27:03.339028225 + @@ -1273,7 +1273,7 @@ nds32_construct_isr_vectors_information /* Pick up each vector id value. */ id = TREE_VALUE (id_list); /* Add vector_number_offset to get actual vector number. */ - vector_id = TREE_INT_CST_LOW (id) + vector_number_offset; + vector_id = tree_to_uhwi (id) + vector_number_offset; /* Enable corresponding vector and set function name. */ nds32_isr_vectors[vector_id].category = (intr) @@ -1315,7 +1315,7 @@ nds32_construct_isr_vectors_information /* The total vectors = interrupt + exception numbers + reset. There are 8 exception and 1 reset in nds32 architecture. */ - nds32_isr_vectors[0].total_n_vectors = TREE_INT_CST_LOW (id) + 8 + 1; + nds32_isr_vectors[0].total_n_vectors = tree_to_uhwi (id) + 8 + 1; strcpy (nds32_isr_vectors[0].func_name, func_name); /* Retrieve nmi and warm function. */ @@ -3145,8 +3145,8 @@ nds32_insert_attributes (tree decl, tree id = TREE_VALUE (id_list); /* Issue error if it is not a valid integer value. */ if (TREE_CODE (id) != INTEGER_CST - || TREE_INT_CST_LOW (id) < lower_bound - || TREE_INT_CST_LOW (id) > upper_bound) + || wi::ltu_p (id, lower_bound) + || wi::gtu_p (id, upper_bound)) error ("invalid id value for interrupt/exception attribute"); /* Advance to next id. */ @@ -3173,8 +3173,8 @@ nds32_insert_attributes (tree decl, tree /* 3. Check valid integer value for reset. */ if (TREE_CODE (id) != INTEGER_CST - || TREE_INT_CST_LOW (id) < lower_bound - || TREE_INT_CST_LOW (id) > upper_bound) + || wi::ltu_p (id, lower_bound) + || wi::gtu_p (id, upper_bound)) error ("invalid id value for reset attribute"); /* 4. Check valid function for nmi/warm. */
Re: fixincludes patch RFA: Fix fenv.h on Ubuntu Precise
On 11/06/13 15:29, Ian Lance Taylor wrote: When fenv.h is not fixed, libquadmath does not build. This patch works around the problem. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for mainline? Hi Ian, Yes, please. This time, I'm on my dev box and looked at the code. You remembered correctly that the first file name in the list of file names needs to not have wild card characters so that the testing scheme can create a file by that name. A directory named "*" is possible, but inconvenient. Anyway, it also matches prior art in: /* * glibc_c99_inline_3 */ fix = { hackname = glibc_c99_inline_3; files = bits/string2.h, '*/bits/string2.h'; but is inconsistent with: /* Some versions of glibc have a version of bits/string2.h that produces "value computed is not used" warnings from strncpy; fix this definition by using __builtin_strncpy instead as in newer versions. */ fix = { hackname = glibc_strncpy; files = bits/string2.h; Hmmm. Does that fix need fixing, too? Thanks -Bruce
wide-int testing results
I wanted to make sure that each backend still builds with wide-int and that there weren't any unexplained changes in assembly output. I arbitrarily picked one target for each CPU: aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf x86_64-darwin and built gcc and g++ from both the merge point and the wide-int branch. The branch included the patches I've posted and a local change to put CONST_WIDE_INT at the end of rtl.def (just for comparison, as mentioned before). I then compiled gcc.c-torture, gcc.dg and g++.dg at -O2 and compared the asm output. This obviously isn't a very strong test, since none of the libraries are built, and since some of the test cases rely on system headers, but it should at least be better than nothing. pdp11 and picochip-elf don't build on mainline due to: gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was not declared in this scope # define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor Both the merge point and the branch built the other targets and there were no new warnings on the branch. The only asm differences were in: * gcc.dg/fixed-point/const-1.s on targets that support it I think this is a known improvement. The test case checks a series of conversions to make sure that the out-of-range ones produce a warning. E.g.: short _Fract sfF = 1.0; /* { dg-warning "overflow" } */ The asm differences are in the values produced for these out-of-range casees. The old code used real_to_integer2, which saturates the result to double_int precision. It then uses the fractional part of that result to initialise sfF. The result therefore depends on the host (at least in the general case). The new code instead saturates to the precision of the result, just like we already do for: signed short s = 32768.0f; So this seems like a good change. * gcc.dg/vect/pr51799.c on m32c-elf and * gcc.c-torture/execute/divconst-2.c on xstormy16-elf These are cases where wide-int converts a narrower-than-HWI multiplication by 1 << (GET_MODE_PRECISION - 1) into a shift but mainline doesn't. The mainline code looks like: /* Convert multiply by constant power of two into shift unless we are still generating RTL. This test is a kludge. */ if (CONST_INT_P (trueop1) && (val = exact_log2 (UINTVAL (trueop1))) >= 0 /* If the mode is larger than the host word size, and the uppermost bit is set, then this isn't a power of two due to implicit sign extension. */ && (width <= HOST_BITS_PER_WIDE_INT || val != HOST_BITS_PER_WIDE_INT - 1)) return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); and this exact_log2 doesn't cope properly with the sign-extended 0xff8000... CONST_INT. The wide-int code is: /* Convert multiply by constant power of two into shift. */ if (CONST_SCALAR_INT_P (trueop1)) { val = wi::exact_log2 (std::make_pair (trueop1, mode)); if (val >= 0 && val < GET_MODE_BITSIZE (mode)) return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); } So this too seems like a good thing. Thanks, Richard
Re: fixincludes patch RFA: Fix fenv.h on Ubuntu Precise
So is this the right patch? $ svn diff inclhack.def Index: inclhack.def === --- inclhack.def(revision 204533) +++ inclhack.def(working copy) @@ -1738,7 +1738,7 @@ versions. */ fix = { hackname = glibc_strncpy; -files = bits/string2.h; +files = bits/string2.h, '*/bits/string2.h'; bypass= "__builtin_strncpy"; c_fix = format; c_fix_arg = "# define strncpy(dest, src, n) __builtin_strncpy (dest, src, n)"; @@ -2411,7 +2411,7 @@ */ fix = { hackname = huge_val_hex; -files = bits/huge_val.h; +files = bits/huge_val.h, '*/bits/huge_val.h'; select= "^#[ \t]*define[ \t]*HUGE_VAL[ \t].*0x1\\.0p.*"; bypass= "__builtin_huge_val"; @@ -2426,7 +2426,7 @@ */ fix = { hackname = huge_valf_hex; -files = bits/huge_val.h; +files = bits/huge_val.h, '*/bits/huge_val.h'; select= "^#[ \t]*define[ \t]*HUGE_VALF[ \t].*0x1\\.0p.*"; bypass= "__builtin_huge_valf"; @@ -2441,7 +2441,7 @@ */ fix = { hackname = huge_vall_hex; -files = bits/huge_val.h; +files = bits/huge_val.h, '*/bits/huge_val.h'; select= "^#[ \t]*define[ \t]*HUGE_VALL[ \t].*0x1\\.0p.*"; bypass= "__builtin_huge_vall"; @@ -4226,7 +4226,7 @@ fix = { hackname = thread_keyword; files = "pthread.h"; -files = "bits/sigthread.h"; +files = bits/sigthread.h, '*/bits/sigthread.h'; select= "([* ])__thread([,)])"; c_fix = format; c_fix_arg = "%1__thr%2"; @@ -4767,7 +4767,7 @@ fix = { hackname = feraiseexcept_nosse_invalid; mach = 'i[34567]86-*-linux*', 'x86*-linux*', 'amd64-*-linux*'; -files = bits/fenv.h; +files = bits/fenv.h, '*/bits/fenv.h'; select= "^([\t ]*)__asm__ __volatile__ \\(\"divss %0, %0 *\" : " ": \"x\" \\(__f\\)\\);$"; bypass= "\"fdiv .*; fwait\""; @@ -4794,7 +4794,7 @@ fix = { hackname = feraiseexcept_nosse_divbyzero; mach = 'i[34567]86-*-linux*', 'x86*-linux*', 'amd64-*-linux*'; -files = bits/fenv.h; +files = bits/fenv.h, '*/bits/fenv.h'; select= "^([\t ]*)__asm__ __volatile__ \\(\"divss %1, %0 *\" : " ": \"x\" \\(__f\\), \"x\" \\(__g\\)\\);$"; bypass= "\"fdivp .*; fwait\"";
Re: [Patch, RTL] Eliminate redundant vec_select moves.
Richard Sandiford wrote: Tejas Belagod writes: Richard Sandiford wrote: Tejas Belagod writes: Richard Sandiford wrote: Tejas Belagod writes: Richard Sandiford wrote: Tejas Belagod writes: + /* This is big-endian-safe because the elements are kept in target + memory order. So, for eg. PARALLEL element value of 2 is the same in + either endian-ness. */ + if (GET_CODE (src) == VEC_SELECT + && REG_P (XEXP (src, 0)) && REG_P (dst) + && REGNO (XEXP (src, 0)) == REGNO (dst)) +{ + rtx par = XEXP (src, 1); + int i; + + for (i = 0; i < XVECLEN (par, 0); i++) + { + rtx tem = XVECEXP (par, 0, i); + if (!CONST_INT_P (tem) || INTVAL (tem) != i) + return 0; + } + return 1; +} + I think for big endian it needs to be: INTVAL (tem) != i + base where base is something like: int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0); E.g. a big-endian V4HI looks like: msb lsb and shortening it to say V2HI only gives the low 32 bits: msb lsb But, in this case we want msb lsb It depends on whether the result occupies a full register or not. I was thinking of the case where it didn't, but I realise now you were thinking of the case where it did. And yeah, my suggestion doesn't cope with that... I was under the impression that the const vector parallel for vec_select represents the element indexes of the array in memory order. Therefore, in bigendian, msb lsb element a[0] a[1] a[2] a[3] and in littleendian msb lsb element a[3] a[2] a[1] a[0] Right. But if an N-bit value is stored in a register, it's assumed to occupy the lsb of the register and the N-1 bits above that. The other bits in the register are don't-care. E.g., leaving vectors to one side, if you have: (set (reg:HI N) (truncate:SI (reg:SI N))) on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this: msb lsb 01234567 4567 rather than: msb lsb 01234567 0123 for both endiannesses. The same principle applies to vectors. The lsb of the register is always assumed to be significant. So maybe the original patch was correct for partial-register and full-register results on little-endian, but only for full-register results on big-endian. Ah, ok! I think I get it. By eliminating set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0] using the check INTVAL (tem) != i, I'm essentially making subsequent operations use (reg:V2DI n) in DI mode which is a partial register result and this gives me the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial register mode, I have to make sure the correct elements coincide with the lsb in big-endian... Thanks for your input, I'll apply the offset correction for big-endian you suggested. I'll respin the patch. Thanks. Just for avoidance of doubt, the result might be a full or partial register, depending on the mode and target. I was trying to correct myself by agreeing that your original was right and mine was wrong for big-endian if the result is a full register. What I had in mind when I implemented this was a partial-reg result, but obviously it was wrong. Sorry, I'm going to take a step back - I'm trying to figure out what a full register result would look like. Looking at the pattern, set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0] the result is always a mode smaller than the src if the vec_select selects a subset of lanes. Plus if the src and dst are the same reg, because we're re-writing the src reg, wouldn't it always end up being a partial-reg? In this case, wouldn't (reg:DI n) always represent msb lsb The problem is that one reg rtx can span several hard registers. E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32), but it might instead represent two 32-bit registers (nos. 32 and 33). Obviously the latter's not very likely for vectors this small, but more likely for larger ones (including on NEON IIRC). So if we had 2 32-bit registers being treated as a V4HI, it would be: <--32--><--33--> msb lsb msb lsb <--32--> for big endian and: <--33--><--32--> msb lsb msb lsb <--32--> for little endian. Ah, ok, that makes things clearer. Thanks for that. I can't find any helper function that figures out if we're writing partial or full result regs. Would something like REGNO (src) == REGNO (dst) && HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1 be a sane check for partial result regs? Thanks, Tejas.
Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification
Hi, On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote: > On 11/06/13 15:48, Jakub Jelinek wrote: > >On Wed, Nov 06, 2013 at 03:24:40PM -0700, Aldy Hernandez wrote: > >>I have checked the following patch with the attached testcases that > >>were previously ICEing, and with a handful of handcrafted tests that > >>I checked manually (array references on lhs and rhs, vectors of > >>pointers, etc). > > > >I'd like Martin to eyeball the ipa changes eventually. > > Indeed! Likewise for all my previous changes to ipa-prop.[ch] and > tree-sra.c. Sorry for the delay. I'd just like to re-iterate one comment from my previous email which is that I do not think tree-sra.c should export any function to the outside world apart from the entry points of the passes (yes, there is already build_ref_for_offset which I admit is entirely my creation but that should be moved elswhere too). So please put sra_ipa_get_adjustment_candidate to ipa-prop.c. I see that it requires get_ssa_base_param to be moved there as well but since IPA-SRA uses it in different places, it would need exporting too, which would be weird because it does not really do anything with parameters. Since the function is so trivial, I would even suggest introducing another private copy to ipa-prop.c (and leaving the original without the new parameter). Alternatively, you can move the function to tree.c but for that it looks too specialized. Thanks, Martin > > >>+ if (!SSA_VAR_P (*tp)) > >>+{ > >>+ /* Make sure we treat subtrees as a RHS. This makes sure that > >>+when examining the `*foo' in *foo=x, the `foo' get treated as > >>+a use properly. */ > > > >I think it wouldn't hurt to e.g. do > > if (TYPE_P (*tp)) > > *walk_subtrees = 0; > >here (and drop ATTRIBUTE_UNUSED above. > > Done. > > > > >>+ wi->is_lhs = false; > >>+ wi->val_only = true; > >>+ return NULL_TREE; > >>+} > >>+ struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info; > >>+ struct ipa_parm_adjustment *cand > >>+= sra_ipa_get_adjustment_candidate (tp, NULL, info->adjustments, true); > >>+ if (!cand) > >>+return NULL_TREE; > >>+ > >>tree t = *tp; > >>+ tree repl = make_ssa_name (create_tmp_var (TREE_TYPE (t), NULL), NULL); > > > >Just do > > tree repl = make_ssa_name (TREE_TYPE (t), NULL); > >no need to create underlying vars since 4.8. > > Done. > > > > >>+ /* We have modified in place; update the SSA operands. */ > >>+ info->modified = false; > > > >So you always set modified to false? I was expecting you'd set > >it to true here and defer update_stmt and maybe_clean_eh_stmt etc. > >to after walk_gimple_op (so, do it only when all the changes on the stmt > >have been performed). Plus, when you modify something, there is no need > >to walk subtrees, so you can just do *walk_subtrees = 0; too. > > Hmmm, good point. I've moved update_stmt and company to the caller, > and modified the caller to call regimplify_operands only for > GIMPLE_RETURN which is the special case. > > Let me know if this is still ok so I can commit. > > Thanks. > Aldy > gcc/ChangeLog.gomp > * ipa-prop.h (sra_ipa_get_adjustment_candidate): Protoize. > * omp-low.c (struct modify_stmt_info): New. > (ipa_simd_modify_function_body_ops_1): Remove. > (ipa_simd_modify_stmt_ops): New. > (ipa_simd_modify_function_body_ops): Remove. > (ipa_simd_modify_function_body): Set new callback info. > Remove special casing. Handle all operators with walk_gimple_op. > * tree-sra.c (get_ssa_base_param): Add new argument. Use it. > (disqualify_base_of_expr): Pass new argument to > get_ssa_base_param. > (sra_ipa_modify_expr): Abstract candidate search into... > (sra_ipa_get_adjustment_candidate): ...here. >
Re: wide-int testing results
I doubt that this list is comprehensive. When i many of these things a long time ago, I was not that interested in bit for bit compatibly as long as was never introducing any problems.in some cases it would have been hard to replicate some of the end cases with the wide-int code. On 11/07/2013 12:51 PM, Richard Sandiford wrote: I wanted to make sure that each backend still builds with wide-int and that there weren't any unexplained changes in assembly output. I arbitrarily picked one target for each CPU: aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf x86_64-darwin and built gcc and g++ from both the merge point and the wide-int branch. The branch included the patches I've posted and a local change to put CONST_WIDE_INT at the end of rtl.def (just for comparison, as mentioned before). I then compiled gcc.c-torture, gcc.dg and g++.dg at -O2 and compared the asm output. This obviously isn't a very strong test, since none of the libraries are built, and since some of the test cases rely on system headers, but it should at least be better than nothing. pdp11 and picochip-elf don't build on mainline due to: gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was not declared in this scope # define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor Both the merge point and the branch built the other targets and there were no new warnings on the branch. The only asm differences were in: * gcc.dg/fixed-point/const-1.s on targets that support it I think this is a known improvement. The test case checks a series of conversions to make sure that the out-of-range ones produce a warning. E.g.: short _Fract sfF = 1.0; /* { dg-warning "overflow" } */ The asm differences are in the values produced for these out-of-range casees. The old code used real_to_integer2, which saturates the result to double_int precision. It then uses the fractional part of that result to initialise sfF. The result therefore depends on the host (at least in the general case). The new code instead saturates to the precision of the result, just like we already do for: signed short s = 32768.0f; So this seems like a good change. * gcc.dg/vect/pr51799.c on m32c-elf and * gcc.c-torture/execute/divconst-2.c on xstormy16-elf These are cases where wide-int converts a narrower-than-HWI multiplication by 1 << (GET_MODE_PRECISION - 1) into a shift but mainline doesn't. The mainline code looks like: /* Convert multiply by constant power of two into shift unless we are still generating RTL. This test is a kludge. */ if (CONST_INT_P (trueop1) && (val = exact_log2 (UINTVAL (trueop1))) >= 0 /* If the mode is larger than the host word size, and the uppermost bit is set, then this isn't a power of two due to implicit sign extension. */ && (width <= HOST_BITS_PER_WIDE_INT || val != HOST_BITS_PER_WIDE_INT - 1)) return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); and this exact_log2 doesn't cope properly with the sign-extended 0xff8000... CONST_INT. The wide-int code is: /* Convert multiply by constant power of two into shift. */ if (CONST_SCALAR_INT_P (trueop1)) { val = wi::exact_log2 (std::make_pair (trueop1, mode)); if (val >= 0 && val < GET_MODE_BITSIZE (mode)) return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); } So this too seems like a good thing. Thanks, Richard
Re: Implement C11 _Atomic
On Thu, Nov 7, 2013 at 5:45 PM, Jakub Jelinek wrote: > On Tue, Nov 05, 2013 at 11:21:56PM +, Joseph S. Myers wrote: >> 2013-11-05 Andrew MacLeod >> Joseph Myers >> >> * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC. >> (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE, >> TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE. >> (struct tree_base): Add atomic_flag field. >> * tree.h (TYPE_ATOMIC): New accessor macro. >> (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC. >> (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro. >> (atomicQI_type_node, atomicHI_type_node, atomicSI_type_node) >> (atomicDI_type_node, atomicTI_type_node): New macros for type >> nodes. >> * tree.c (set_type_quals): Set TYPE_ATOMIC. >> (find_atomic_core_type): New function. >> (build_qualified_type): Adjust alignment for qualified types. >> (build_atomic_base): New function >> (build_common_tree_nodes): Build atomicQI_type_node, >> atomicHI_type_node, atomicSI_type_node, atomicDI_type_node and >> atomicTI_type_node. >> * print-tree.c (print_node): Print atomic qualifier. >> * tree-pretty-print.c (dump_generic_node): Print atomic type >> attribute. >> * target.def (atomic_assign_expand_fenv): New hook. >> * doc/tm.texi.in (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New @hook. >> * doc/tm.texi: Regenerate. >> * targhooks.c (default_atomic_assign_expand_fenv): New function. >> * targhooks.h (default_atomic_assign_expand_fenv): Declare. >> * sync-builtins.def (__atomic_feraiseexcept): New built-in >> function. > >> 2013-11-05 Joseph Myers >> >> * lib/target-supports.exp >> (check_effective_target_fenv_exceptions): New function. >> * lib/atomic-dg.exp, gcc.dg/atomic/atomic.exp: New files. >> * gcc.dg/atomic/c11-atomic-exec-1.c, >> gcc.dg/atomic/c11-atomic-exec-2.c, >> gcc.dg/atomic/c11-atomic-exec-3.c, >> gcc.dg/atomic/c11-atomic-exec-4.c, >> gcc.dg/atomic/c11-atomic-exec-5.c, gcc.dg/c11-atomic-1.c, >> gcc.dg/c11-atomic-2.c, gcc.dg/c11-atomic-3.c, >> gcc.dg/c90-atomic-1.c, gcc.dg/c99-atomic-1.c: New tests. >> >> libatomic: >> 2013-11-05 Joseph Myers >> >> * fenv.c: New file. >> * libatomic.map (LIBATOMIC_1.1): New symbol version. Include >> __atomic_feraiseexcept. >> * configure.ac (libtool_VERSION): Change to 2:0:1. >> (fenv.h): Test for header. >> * Makefile.am (libatomic_la_SOURCES): Add fenv.c. >> * Makefile.in, auto-config.h.in, configure: Regenerate. > > The middle-end, libatomic and testsuite changes are ok. Please note that following code form fenv.c won't generate overflow exception on x87: if (excepts & FP_EX_OVERFLOW) { volatile float max = __FLT_MAX__; r = max * max; } Uros.
Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart
On Thu, Nov 7, 2013 at 8:23 AM, Andrew MacLeod wrote: > On 11/07/2013 05:36 AM, Basile Starynkevitch wrote: >> >> On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote: >>> >>> I decided to name the new file gimple-expr.[ch] instead of >>> gimple-decl This will eventually split into gimple-type.[ch], >>> gimple-decl.[ch], and gimple-expr.[ch]. >> >> >> Since we are adding *new* C++ files, can't we please name them *.cc >> for the implementation part, so at least create gimple-expr.h and >> gimple-expr.cc but not gimple-expr.c, please! > > Assuming we put this into stage 1 next year, I would imagine we'd rename a > number of things, including .cc, drop the tree-* from the tree-ssa-blah.[c]h > files, etc etc. There have been a few things people have suggested > renaming... I think if we do renaming, they ought to be all at one time to > minimize the pain. > > At the moment, the new gimple-* files I'm creating are still C, so they are > .c files... Agreed. When we start shuffling files around seems a better time. Doing both operations at once will be easier than going through two phases of naming changes. Diego.
Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level
Kenneth Zadeck writes: > I very strongly disagree with this. The standard needs to be high than > "does it pass the test suite." > > What we are introducing is a case where the program will behave one way > with optimization and another way without it. While, this is always > true for timing dependent code, it is pretty rare for math. We should > always tread carefully when doing that, and the excuse that it is > "cleaner" does not, in my mind, fit the bill. If it makes you feel any better (probably not), this was already true for !SHIFT_COUNT_TRUNCATED targets. The tree level would then not do any truncation, whereas as you said before, all real targets do in practice. And that affected many of the main targets, including x86_64, powerpc, z/Series, ARM, etc. (And the tree level has to do _something_. "unsigned int i = 1 << 32;" must compile with default options, even if the behaviour's undefined. Only rtl has the luxury of being able to punt.) Although you could go out of your way to try to make the undefined behaviour match what the target instructions would do, that'd be a lot of work, especially for cases that are implemented using library calls. And there's a lot to be said for making the behaviour consistent across targets, which is what removing the truncation from the tree level does. Things are different at the rtl level because there a backend can legitimately exploit the behaviour of the shift instructions when implementing optabs. And optabs.c itself does this when creating doubleword shifts. So at the rtl level we can only optimise out-of-range shifts if we're sure what the target would do. Thanks, Richard
RE: Initial submission of OpenACC support integrated into OpenMP's lowering and expansion passes
Hi, Thomas! This is really great! I've looked at your changes and in most of front-end parts it looks reasonable to me. As you know, we're like-minded with you about how OpenACC's front-ends should look like. So, I think it's good that you have working flow for your implementation. Now we can start to merge front-ends from openacc-1_0-branch to GOMP, while keeping all understandings of ACC in sync and working. Jakub, don't you mind if I prepare a few front-end patches likewise for review to extend Thomas' vision with the things that was already done by my colleagues on the ACC branch? I hope it'll much speed up the development. > the last patch adds GOACC_parallel, which so far simply branches to > GOMP_target. There's more to come. I've reviewed ACC-related code from [openacc-1_0-branch] and from the patches, I noticed there are some mess around marking OpenACC specific code. There are ACC, GACC, OACC, GOACC abbreviations used across code, which one is the best? I prefer ACC, this is not critical, I'm just curious how to resolve this. > Due to the conceptual similarity compared to OpenMP, and that (later) > it is reasonable to expect to embed OpenACC directives into OpenMP > ones, the approach I've chosen is to directly embed OpenACC handling > into the existing OpenMP infrastructure/passes. Regarding front-ends, not only from view of OMP/ACC, but front-ends itself, it makes sense to move some parsing routines out of gomp/acc context. This will be useful to implement not only ACC, but also HMPP, if someone is interested, or other similar technology. And of cause, these functions should be used behind the existing facilities - not to break current implementation and other dependencies. So it won't not introduce any difficulties for back porting, while generalizing front-ends a bit. For now, I prefer to extend the approved way of developing, but keep it in mind, please. > This patch series doesn't contain any substantial rumtime library work > yet; Can you share some technical details/ideas about further implementation, it's not going to be trivial. On the [openacc-1_0-branch] we resolved it with OpenCL-driven runtime. Also, OpenACC 2.0 standard declares some vendor-specific routines, this should be noted too. > This is in contrast to Samsung's work, who are implementing OpenACC > separately from the existing OpenMP support. I'm not fully agree that this in contrast to things are taking place in [openacc-1_0-branch]. The main reason for keeping middle-end and back-end solutions far from GOMP is the understanding of OpenACC semantics and unclear understanding of what should be done to generalize support of accelerators in GCC. OpenACC and OpenOMP has some semantically differences, but such things matters only for middle-end and further code generation. The main one, is the memory concept. The > Yet, I hope we'll be able to share/re-use at least front end and some middle end code. I absolutely agree, it's needed to merge our efforts regarding front-ends. Many things from ACC in [openacc-1_0-branch] is already done, so it doesn't necessary to re-implement same things. Maybe we can polish front-ends together and then commit changes to GOMP? > We directly strive for OpenACC 2.0 support, skipping OpenACC 1. We're > focussing on the C front end implementation first, following on with > C++ and Fortran later on. How do you think, does it make sense to adopt current Fortran implementation from the OpenACC branch? This can be done quite soon. Also, I may help to extend C and C++ front-end the same way it's done now, by adding other implemented directives, too. Regarding proposed patches: [8/9] c_parser_omp_all_clauses and cp_parser_omp_all_clauses - c_parser_error (parser, "expected %<#pragma omp%> clause"); + c_parser_error (parser, "expected clause"); - cp_parser_error (parser, "expected %<#pragma omp%> clause"); + cp_parser_error (parser, "expected clause"); Does it really needed to remove %<#pragma omp%> in these cases? We handle both, I think. [9/9] * gimple.h (gimple_build_oacc_*): New declaration. gimple_oacc_parallel_* and other functions like that. Let's move these functions out of gimple.* to gimple-oacc.*! There are going to be much more of the stuff like this, maybe it'll make sense to keep it externally? May be not only gimple manipulation functions but also pretty-print Also, have a look on openacc_1-0_branch please, for gimple-oacc.*, is OK if I'll add some of this? Handling clauses. + GIMPLE_CHECK (gs, GIMPLE_OACC_PARALLEL); + gs->gimple_omp_parallel.clauses = clauses; I think, we can freely add some kind of flag to gimple_omp_parallel to indicate that we're are working with ACC. Or even add new gimple_acc_* statements. On gimplification, lowering stuff and further, I think it's okay if targeting GOMP_target, but on my view, this is going to prevent us from emiting OpenCL or do some extended analysis of the
Re: Clean up atomic tests
On Nov 7, 2013, at 8:43 AM, Joseph S. Myers wrote: > This patch cleans up various issues with the tests of atomics built-in > functions and libatomic functions, in preparation for adapting those > tests to add test coverage of stdatomic.h macros. The tests were > missing a return type for main (C11 doesn't allow implicit int return > types). Ok for those changes. > Some tests were incrementing a variable in one part of an > expression and using its value in another part without a sequence > point in between. Ok for those changes. > And the libatomic tests shouldn't have been > restricted to targets with hardware sync_* support, or adding > command-line options to make such support available, For this I'd like a domain expert to weigh in. If that is you, then you can check it in as obvious if you'd like. If you want a review, you can wait for them to speak up.
Re: [wide-int, committed] Fix nds32 build
On Nov 7, 2013, at 9:47 AM, Richard Sandiford wrote: > Get the new port building on wide-int. Seemed pretty obvious so I went > ahead and installed it. Looks good.
Re: wide-int testing results
Kenneth Zadeck writes: > I doubt that this list is comprehensive. Hmm? It was supposed to be one target for each CPU, so if you think I've missed one, please point it out. It certainly wasn't supposed to be every target triple that gcc supports. Even one target per CPU was useful, since quite a few wouldn't compile. > When i many of these things a long time ago, I was not that interested > in bit for bit compatibly as long as was never introducing any > problems.in some cases it would have been hard to replicate some of > the end cases with the wide-int code. Not sure what you mean. The point was that if wide-int made things better, we should just say "good" and move on. Which is why I'm not trying to get bit-for-bit compatibility in the testcases I pointed out. But for a change of this size I think we should at least know why any differences occur. Thanks, Richard > On 11/07/2013 12:51 PM, Richard Sandiford wrote: >> I wanted to make sure that each backend still builds with wide-int and that >> there weren't any unexplained changes in assembly output. I arbitrarily >> picked one target for each CPU: >> >> aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi >> avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf >> fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf >> lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf >> microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf >> msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf >> powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu >> score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf >> tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf >> x86_64-darwin >> >> and built gcc and g++ from both the merge point and the wide-int branch. >> The branch included the patches I've posted and a local change >> to put CONST_WIDE_INT at the end of rtl.def (just for comparison, >> as mentioned before). I then compiled gcc.c-torture, gcc.dg and g++.dg >> at -O2 and compared the asm output. This obviously isn't a very strong >> test, since none of the libraries are built, and since some of the test >> cases rely on system headers, but it should at least be better than nothing. >> >> pdp11 and picochip-elf don't build on mainline due to: >> >> gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was >> not declared in this scope >> # define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor >> >> Both the merge point and the branch built the other targets and there were >> no new warnings on the branch. The only asm differences were in: >> >> * gcc.dg/fixed-point/const-1.s on targets that support it >> >>I think this is a known improvement. The test case checks a series of >>conversions to make sure that the out-of-range ones produce a warning. >>E.g.: >> >> short _Fract sfF = 1.0; /* { dg-warning "overflow" } */ >> >>The asm differences are in the values produced for these out-of-range >>casees. The old code used real_to_integer2, which saturates the result >>to double_int precision. It then uses the fractional part of that result >>to initialise sfF. The result therefore depends on the host (at least >>in the general case). >> >>The new code instead saturates to the precision of the result, just like >>we already do for: >> >> signed short s = 32768.0f; >> >>So this seems like a good change. >> >> * gcc.dg/vect/pr51799.c on m32c-elf and >> * gcc.c-torture/execute/divconst-2.c on xstormy16-elf >> >>These are cases where wide-int converts a narrower-than-HWI >>multiplication by 1 << (GET_MODE_PRECISION - 1) into a shift but >>mainline doesn't. The mainline code looks like: >> >>/* Convert multiply by constant power of two into shift unless >> we are still generating RTL. This test is a kludge. */ >>if (CONST_INT_P (trueop1) >>&& (val = exact_log2 (UINTVAL (trueop1))) >= 0 >>/* If the mode is larger than the host word size, and the >> uppermost bit is set, then this isn't a power of two due >> to implicit sign extension. */ >>&& (width <= HOST_BITS_PER_WIDE_INT >>|| val != HOST_BITS_PER_WIDE_INT - 1)) >> return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); >> >>and this exact_log2 doesn't cope properly with the sign-extended >>0xff8000... CONST_INT. The wide-int code is: >> >>/* Convert multiply by constant power of two into shift. */ >>if (CONST_SCALAR_INT_P (trueop1)) >> { >>val = wi::exact_log2 (std::make_pair (trueop1, mode)); >>if (val >= 0 && val < GET_MODE_BITSIZE (mode)) >> return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val)); >> } >> >>So this too seems like a good thing. >> >> Thanks, >> Richard
RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c
> On 11/07/13 09:32, Dodji Seketeli wrote: > >> From the above, what I can say is that input.o was already linked with >> gcov. But I think it's minimal enough to only drag libcpp and the >> diagnostic subsystem. > > I disagree. While input.o was available to gcov, I don't think it was being > pulled into the executable (if it was, I'd like to understand why). I think > dragging in libcpp & diagnostics is a major bloat -- have you measured the > before and after gcov text sizes? I agree that this should be avoided. However both versions of read_line need to be fixed eventually. Just for the records, this is not only a theoretical issue for gcov: The OOM error can also be reproduced in gcov. The attached test.c has some dozends of comment lines: //xx\0EOL That are lines with text before and after the \0. If compiled without -Wall, it will crash in gcov: gcc -fprofile-arcs -ftest-coverage test.c ./a.out gcov test File 'test.c' Lines executed:100.00% of 1 Creating 'test.c.gcov' gcov: out of memory allocating 1677721600 bytes after a total of 135168 bytes Bernd.//00 EOL //01 EOL //02 EOL //03 EOL //04 EOL //05 EOL //06 EOL //07 EOL //08 EOL //09 EOL //10 EOL //11 EOL //12 EOL //13 EOL //14 EOL //15 EOL //16 EOL //17 EOL //18 EOL //19 EOL //20 EOL //21 EOL //22 EOL //23 EOL //24 EOL //25 EOL //26 EOL //27 EOL //28 EOL //29 EOL //30 EOL //31 EOL //32 EOL //33 EOL //34 EOL //35 EOL //36 EOL //37 EOL //38 EOL //39 EOL //40 EOL //41 EOL //42 EOL int main() { } // EOF
Re: [PATCH] make has_gate and has_execute useless
On Thu, Nov 07, 2013 at 10:30:16AM -0700, Jeff Law wrote: > On 11/07/13 09:00, tsaund...@mozilla.com wrote: > >From: Trevor Saunders > > > >Hi, > > > > This is the result of seeing what it would take to get rid of the has_gate > > and > >has_execute flags on pass_data. It turns out not much, but I wanted > >confirmation this part is ok before I go deal with all the places that > >initialize the fields. > Most definitely, the has_gate/has_execute stuff is definitely a wart. > > > > >I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test > >suite > >regressions (ignoring the silk stuff because the full paths in its test names > >break my test script for now) Any reason this patch with the actual removal > >of the flags wouldn't be ok? > I don't see anything glaringly wrong. great, I'll work on ripping them out then :) > Any reason why the register_dump_files bits were mucking with > properties. That just seems wrong. no, it looks like its pretty old, the most recent change I can see is r110742 in 2006, but I agree it seems crazy, but atleast its not actually doing anything now. > >+ todo_after = pass->execute (); > >+ if (todo_after != TODO_absolutely_nothing) > >+do_per_function (clear_last_verified, NULL); > >+ else > >+todo_after &= ~TODO_absolutely_nothing; > > Isn't the last assignment there just > todo_after = 0; yes Trev > > ? > > Jeff