Re: [PATCH] Fix static libasan link
> Ok with those changes.
>
>	Jakub

Thanks. Committed revision 204934.

-Y
Re: Revert libsanitizer patches or fix 59009
On Sat, Nov 16, 2013 at 7:59 AM, John David Anglin wrote:
> As far as I can tell, libsanitizer works on hppa-linux. So, the change
> could be added to the llvm tree.
> However, I'm unlikely to test anything in the tree unless someone tells me
> there's something to test.

Submitted the hppa patch as
http://llvm.org/viewvc/llvm-project?view=revision&revision=194995

>
> Dave
>
> On 15-Nov-13, at 10:52 PM, Konstantin Serebryany wrote:
>
>> Dave,
>>
>> Do you want the asan/asan_linux.cc (# elif defined(__hppa__)) part to
>> be in the llvm tree?
>>
>> --kcc
>>
>> On Sat, Nov 16, 2013 at 3:55 AM, John David Anglin wrote:
>>>
>>> On 15-Nov-13, at 9:51 AM, Jakub Jelinek wrote:
>>>
>>> On Fri, Nov 15, 2013 at 08:16:47AM -0600, Peter Bergner wrote:
> On Wed, 2013-11-13 at 11:25 -0600, Peter Bergner wrote:
>> On Wed, 2013-11-13 at 00:49 +0100, Jakub Jelinek wrote:
>>>
>>> 2013-11-12  Jakub Jelinek
>>>
>>>	* sanitizer_common/sanitizer_platform_limits_linux.cc: Temporarily
>>>	ifdef out almost the whole source.
>>>	* sanitizer_common/sanitizer_common_syscalls.inc: Likewise.
>>
>> That helps, but as Pat reported in the bugzilla, it still is failing.
>> With the following patch, we can now bootstrap on powerpc64-linux.
>>
>> Is this ok for trunk?
>>
>> Does this help the other architectures that are failing for the same
>> build error?
>
> Ok, Dave reported in PR59009 that my last patch still left a few build
> problems on HPPA. Dave tested the patch below and confirmed this cleans

How can there be problems on HPPA? libsanitizer/configure.tgt says that
hppa* is UNSUPPORTED, so libsanitizer should never be built there.
Furthermore, it would be nice to understand why the sigaction is
different.

>>> Actually, it turns out I have had a patch in my tree enabling it.
>>>
>>> Dave
>>> --
>>> John David Anglin  dave.ang...@bell.net
>
> --
> John David Anglin  dave.ang...@bell.net
Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p
So, the validation is ok with this patch, I'm just not able to say if
the original performance issue is still fixed with it. Could you check
it Kyrylo?

Yvan

2013-11-17  Yvan Roux

	* config/arm/arm.md (store_minmaxsi): Use only when
	optimize_function_for_size_p.

On 15 November 2013 15:59, Yvan Roux wrote:
>> Sometimes 4 will be needed, since both original register values may
>> remain live.
>
> Indeed.
>
>> However, I'm inclined to agree that while it should be possible to
>> decide at the *function* level whether or not an insn is valid, doing so
>> at the block level is probably unsafe.
>
> Ok, so the attached patch should fix the issue, its validation is ongoing.
Re: [PATCH, ARM, LRA] Fixed bootstrap failure in Thumb mode
Ping.

On 7 November 2013 15:56, Yvan Roux wrote:
> Hi,
>
> this patch fixed an LRA cycling due to secondary reload (Thumb mode).
> Notice that this patch is a prerequisite to turn on LRA by default on
> ARM. Bootstrapped on a9 and a15 without any regression in the
> testsuite as LRA is off by default, and with the regression reported in
> the thread below when LRA is on.
>
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html
>
> Thanks,
> Yvan
>
> 2013-11-07  Yvan Roux
>
>	* config/arm/arm.h (THUMB_SECONDARY_INPUT_RELOAD_CLASS): Return
>	NO_REGS for LRA.
[Patch, Fortran, OOP] PR 59143: Bogus warning with array-valued type-bound procedure
Hi all,

here is a straightforward patch to teach 'get_expr_storage_size' about
type-bound procedures (which are handled internally as procedure-pointer
components of the corresponding vtab). In that sense the patch should
handle both TBPs as well as PPCs.

Regtested on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus

2013-11-18  Janus Weil

	PR fortran/59143
	* interface.c (get_expr_storage_size): Handle array-valued
	type-bound procedures.

2013-11-18  Janus Weil

	PR fortran/59143
	* gfortran.dg/typebound_proc_30.f90: New.

Index: gcc/fortran/interface.c
===================================================================
--- gcc/fortran/interface.c	(revision 204897)
+++ gcc/fortran/interface.c	(working copy)
@@ -2426,6 +2426,17 @@ get_expr_storage_size (gfc_expr *e)
			    - mpz_get_si (ref->u.ar.as->lower[i]->value.integer));
	      }
	  }
+	else if (ref->type == REF_COMPONENT && ref->u.c.component->attr.function
+		 && ref->u.c.component->attr.proc_pointer
+		 && ref->u.c.component->attr.dimension)
+	  {
+	    /* Array-valued procedure-pointer components.  */
+	    gfc_array_spec *as = ref->u.c.component->as;
+	    for (i = 0; i < as->rank; i++)
+	      elements = elements
+			 * (mpz_get_si (as->upper[i]->value.integer)
+			    - mpz_get_si (as->lower[i]->value.integer) + 1L);
+	  }
       }

   if (substrlen)

! { dg-do compile }
!
! PR 59143: [OOP] Bogus warning with array-valued type-bound procedure
!
! Contributed by Jürgen Reuter

module phs_single
  type :: phs_single_t
  contains
    procedure, nopass :: decay_p
  end type
contains
  subroutine evaluate (phs)
    class(phs_single_t) :: phs
    call func1 (phs%decay_p ())
  end subroutine
  subroutine func1 (p)
    real :: p(2)
  end subroutine
  function decay_p ()
    real :: decay_p(2)
    decay_p = 1.
  end function
end module

! { dg-final { cleanup-modules "phs_single" } }
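As a side note for readers, the arithmetic in the new hunk is just the usual element count of a rank-n array: the product over all dimensions of (upper - lower + 1). A minimal C sketch of that computation (my illustration only — the real GCC code works on GMP mpz values, not longs):

```c
/* Number of elements in an array with the given per-dimension
   lower and upper bounds (inclusive), i.e. prod(upper - lower + 1).  */
long array_element_count (const long *lower, const long *upper, int rank)
{
  long elements = 1;
  for (int i = 0; i < rank; i++)
    elements *= upper[i] - lower[i] + 1;
  return elements;
}
```

So a `real :: decay_p(2)` function result (bounds 1..2) contributes 2 elements, which is what lets the storage-size check see that the actual argument matches the dummy `p(2)` and suppress the bogus warning.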
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
> Yes, it passed all tests with -m32, -mx32, -m64 on Linux/x86-64.
> Installed on trunk. Is this OK to backport to 4.8 branch after
> a few days?

Actually there are two issues with your change:

Using time_t for tv_nsec looks actually wrong; the definition on (my)
linux is:

struct timespec {
	__kernel_time_t	tv_sec;		/* seconds */
	long		tv_nsec;	/* nanoseconds */
};

As you can see, the types for tv_sec and tv_nsec are different, so the
change to timespec looks wrong to me.

Also, changing the tv_nsec type breaks s-osinte-solaris-posix.ads, which
would need a similar change, although not if we change the above.

Can you please follow up on that? In particular, what is the definition
of timespec on x32 linux? Shouldn't tv_nsec be back to a long as it was
before? Looks to me this way.

Arno
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
> Actually there are two issues with your change:
>
> Using time_t for tv_nsec looks actually wrong; the definition on (my)
> linux is:
>
> struct timespec {
>	__kernel_time_t	tv_sec;		/* seconds */
>	long		tv_nsec;	/* nanoseconds */
> };
>
> As you can see, the types for tv_sec and tv_nsec are different.
>
> So the change to timespec looks wrong to me.

In addition, concerning s-taprop-linux.adb, the definition of struct
timeval on linux is:

/* A time value that is accurate to the nearest
   microsecond but also has a range of years.  */
struct timeval
{
    __time_t tv_sec;		/* Seconds.  */
    __suseconds_t tv_usec;	/* Microseconds.  */
};

So again, we have another type (__suseconds_t), and NOT time_t or
__time_t, so it's really wrong for tv_usec to use time_t.

Arno
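For readers who want to check the field types being debated on their own system, a small C11 probe can do it at compile time. This is my own sketch, not part of the patch, and it assumes a typical Linux/x86-64 glibc, where both tv_nsec and tv_usec are plain long rather than time_t:

```c
#include <time.h>
#include <sys/time.h>

/* Return 1 if struct timespec's tv_nsec field has type long.
   _Generic dispatches on the field's declared type at compile time.  */
int tv_nsec_is_long (void)
{
  struct timespec ts = { 0, 0 };
  return _Generic (ts.tv_nsec, long: 1, default: 0);
}

/* Return 1 if struct timeval's tv_usec field has type long
   (on x86-64 glibc, __suseconds_t is a typedef for long).  */
int tv_usec_is_long (void)
{
  struct timeval tv = { 0, 0 };
  return _Generic (tv.tv_usec, long: 1, default: 0);
}
```

On targets such as x32, where time_t is 64-bit but long is 32-bit, the distinction between these field types is exactly what the thread is about.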
Re: [0/10] Replace host_integerp and tree_low_cst
Jeff Law writes:
> On 11/16/13 05:53, Richard Sandiford wrote:
>> After the patch that went in yesterday, all calls to host_integerp and
>> tree_low_cst pass a constant "pos" argument. This series replaces each
>> function with two separate ones:
> [ ... ]
> So I've almost entirely ignored the whole wide-int conversion discussion
> and I suspect I'm not entirely alone.
>
> Can you briefly summarize what y'all are trying to accomplish with the
> wide-int changes?

At the moment, we can only handle tree and rtl integer constants that
fit in 2 HOST_WIDE_INTs. The idea is to remove that limit. E.g. things
like OImode (used in a few ports) will become a first-class citizen,
with all OImode values being representable.

Besides that headline reason, there are various side benefits. E.g.:

- All INTEGER_CSTs can be viewed either in their TYPE_PRECISION or in
  "infinite" precision, which isn't possible for 128-bit constants
  today. (I.e. there's no way to distinguish signed and unsigned
  128-bit constants in a double_int.)

- Wider-than-2-HWI intermediate results can be represented as a single
  integer. I'm told this is useful for VRP. (wide-int is mostly Kenny
  and Mike's work, I've just been butting in recently.)

- rtl-level constant folding can use the same code to handle all
  combinations of CONST_INT and CONST_DOUBLE (and CONST_WIDE_INT, on
  converted ports). At the moment we handle CONST_INT cases specially,
  and don't try as hard with CONST_DOUBLEs.

Implementation-wise, it tries to make it so that the common single-HWI
cases are still fast.

Thanks,
Richard
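The 128-bit signed/unsigned ambiguity is easy to see outside GCC. In this illustrative snippet (my sketch, not GCC code, using the GCC/Clang `__int128` extension), a 128-bit constant stored in two 64-bit host words has no spare bit to record signedness, so the same pair of words decodes to two different "infinite precision" values:

```c
#include <stdint.h>

/* Reassemble two 64-bit "host wide ints" as an unsigned 128-bit value.  */
unsigned __int128 decode_unsigned (uint64_t low, uint64_t high)
{
  return ((unsigned __int128) high << 64) | low;
}

/* The same bit pattern, reinterpreted as a signed 128-bit value.  */
__int128 decode_signed (uint64_t low, uint64_t high)
{
  return (__int128) (((unsigned __int128) high << 64) | low);
}
```

With both words all-ones, the signed reading is -1 while the unsigned reading is 2^128 - 1; a double_int carries only the two words, so it cannot tell those constants apart, whereas wide-int records the precision and signedness alongside the bits.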
Re: Pass floating point values on powerpc64 as per ABI
Alan Modra writes:

> diff -urp gcc3/libffi/testsuite/libffi.call/cls_double_va.c gcc4/libffi/testsuite/libffi.call/cls_double_va.c
> --- gcc3/libffi/testsuite/libffi.call/cls_double_va.c	2013-11-15 23:03:07.193964372 +1030
> +++ gcc4/libffi/testsuite/libffi.call/cls_double_va.c	2013-11-15 23:22:51.383884118 +1030
> @@ -38,26 +38,24 @@ int main (void)
>
>   /* This printf call is variadic */
>   CHECK(ffi_prep_cif_var(&cif, FFI_DEFAULT_ABI, 1, 2, &ffi_type_sint,
> -			 arg_types) == FFI_OK);
> +			 arg_types) == FFI_OK);
>
>   args[0] = &format;
>   args[1] = &doubleArg;
>   args[2] = NULL;
>
>   ffi_call(&cif, FFI_FN(printf), &res, args);
> -  // { dg-output "7.0" }
> +  /* { dg-output "7.0" } */
>   printf("res: %d\n", (int) res);
> -  // { dg-output "\nres: 4" }
> +  /* { dg-output "\nres: 4" } */
>
> -  /* The call to cls_double_va_fn is static, so have to use a normal prep_cif */
> -  CHECK(ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2, &ffi_type_sint, arg_types) == FFI_OK);

This breaks ia64:

FAIL: libffi.call/cls_double_va.c -O0 -W -Wall output pattern test, is
7.0
res: 4
0.0
res: 4
, should match
7.0?
res: 4?
7.0?
res: 4

FAIL: libffi.call/cls_longdouble_va.c -O0 -W -Wall output pattern test, is
7.0
res: 4
0.0
res: 4
, should match
7.0?
res: 4?
7.0?
res: 4

Andreas.

--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
[AArch64] AArch64 SIMD Builtins Better Type Correctness.
Hi,

This patch adds infrastructure to allow us to model the correct types
for the intrinsics in arm_neon.h. Thus far we have relied on ugly casts
between data types, and treated every intrinsic as taking 'signed'
vectors. So we have intrinsics in arm_neon.h which look like:

uint32x4_t vaddw_high_u16 (uint32x4_t __a, uint16x8_t __b)
{
  return (uint32x4_t) __builtin_aarch64_uaddw2v8hi ((int32x4_t) __a,
						    (int16x8_t) __b);
}

What we would really like is to remove these casts so we get something
more like:

uint32x4_t vaddw_high_u16 (uint32x4_t __a, uint16x8_t __b)
{
  return __builtin_aarch64_uaddw2v8hi_uuu (__a, __b);
}

This is much easier on the eye, and less error prone. This patch adds
some infrastructure to encode the type signatures for the functions and
to construct vector types as we need them.

I've tested the patch on aarch64-none-elf with no regressions, and put
it up against an internal testsuite for Neon intrinsics with no
problems.

OK?

Thanks,
James

---
gcc/

2013-11-18  James Greenhalgh

	* gcc/config/aarch64/aarch64-builtins.c
	(aarch64_simd_itype): Remove.
	(aarch64_simd_builtin_datum): Remove itype, add
	qualifiers pointer.
	(VAR1): Use qualifiers.
	(aarch64_build_scalar_type): New.
	(aarch64_build_vector_type): Likewise.
	(aarch64_build_type): Likewise.
	(aarch64_init_simd_builtins): Refactor, remove special cases,
	consolidate main loop.
	(aarch64_simd_expand_args): Likewise.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 2f1a8d0..28f576d 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -81,57 +81,101 @@ enum aarch64_simd_builtin_type_mode

 #define UP(X) X##_UP

-typedef enum
+#define SIMD_MAX_BUILTIN_ARGS 5
+
+enum aarch64_type_qualifiers
 {
-  AARCH64_SIMD_BINOP,
-  AARCH64_SIMD_TERNOP,
-  AARCH64_SIMD_QUADOP,
-  AARCH64_SIMD_UNOP,
-  AARCH64_SIMD_GETLANE,
-  AARCH64_SIMD_SETLANE,
-  AARCH64_SIMD_CREATE,
-  AARCH64_SIMD_DUP,
-  AARCH64_SIMD_DUPLANE,
-  AARCH64_SIMD_COMBINE,
-  AARCH64_SIMD_SPLIT,
-  AARCH64_SIMD_LANEMUL,
-  AARCH64_SIMD_LANEMULL,
-  AARCH64_SIMD_LANEMULH,
-  AARCH64_SIMD_LANEMAC,
-  AARCH64_SIMD_SCALARMUL,
-  AARCH64_SIMD_SCALARMULL,
-  AARCH64_SIMD_SCALARMULH,
-  AARCH64_SIMD_SCALARMAC,
-  AARCH64_SIMD_CONVERT,
-  AARCH64_SIMD_FIXCONV,
-  AARCH64_SIMD_SELECT,
-  AARCH64_SIMD_RESULTPAIR,
-  AARCH64_SIMD_REINTERP,
-  AARCH64_SIMD_VTBL,
-  AARCH64_SIMD_VTBX,
-  AARCH64_SIMD_LOAD1,
-  AARCH64_SIMD_LOAD1LANE,
-  AARCH64_SIMD_STORE1,
-  AARCH64_SIMD_STORE1LANE,
-  AARCH64_SIMD_LOADSTRUCT,
-  AARCH64_SIMD_LOADSTRUCTLANE,
-  AARCH64_SIMD_STORESTRUCT,
-  AARCH64_SIMD_STORESTRUCTLANE,
-  AARCH64_SIMD_LOGICBINOP,
-  AARCH64_SIMD_SHIFTINSERT,
-  AARCH64_SIMD_SHIFTIMM,
-  AARCH64_SIMD_SHIFTACC
-} aarch64_simd_itype;
+  /* T foo.  */
+  qualifier_none = 0x0,
+  /* unsigned T foo.  */
+  qualifier_unsigned = 0x1, /* 1 << 0 */
+  /* const T foo.  */
+  qualifier_const = 0x2, /* 1 << 1 */
+  /* T *foo.  */
+  qualifier_pointer = 0x4, /* 1 << 2 */
+  /* const T *foo.  */
+  qualifier_const_pointer = 0x6, /* qualifier_const | qualifier_pointer */
+  /* Used when expanding arguments if an operand could
+     be an immediate.  */
+  qualifier_immediate = 0x8, /* 1 << 3 */
+  qualifier_maybe_immediate = 0x10, /* 1 << 4 */
+  /* void foo (...).  */
+  qualifier_void = 0x20, /* 1 << 5 */
+  /* Some patterns may have internal operands, this qualifier is an
+     instruction to the initialisation code to skip this operand.  */
+  qualifier_internal = 0x40, /* 1 << 6 */
+  /* Some builtins should use the T_*mode* encoded in a simd_builtin_datum
+     rather than using the type of the operand.  */
+  qualifier_map_mode = 0x80, /* 1 << 7 */
+  /* qualifier_pointer | qualifier_map_mode */
+  qualifier_pointer_map_mode = 0x84,
+  /* qualifier_const_pointer | qualifier_map_mode */
+  qualifier_const_pointer_map_mode = 0x86
+};

 typedef struct
 {
   const char *name;
-  const aarch64_simd_itype itype;
   enum aarch64_simd_builtin_type_mode mode;
   const enum insn_code code;
   unsigned int fcode;
+  enum aarch64_type_qualifiers *qualifiers;
 } aarch64_simd_builtin_datum;

+static enum aarch64_type_qualifiers
+aarch64_types_unop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none };
+#define TYPES_UNOP (aarch64_types_unop_qualifiers)
+#define TYPES_CREATE (aarch64_types_unop_qualifiers)
+#define TYPES_REINTERP (aarch64_types_unop_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_binop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_maybe_immediate };
+#define TYPES_BINOP (aarch64_types_binop_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_ternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_none, qualifier_none };
+#define TYPES_TE
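To make the encoding concrete: the idea is that each builtin's signature becomes a small array of per-operand qualifier bitmasks, return value first, instead of one enum value per operation shape. The sketch below uses invented names in plain C, not the actual GCC code:

```c
/* Per-operand type qualifiers, combinable as bit flags.  */
enum type_qualifiers
{
  QUAL_NONE          = 0x0, /* plain signed T */
  QUAL_UNSIGNED      = 0x1,
  QUAL_CONST         = 0x2,
  QUAL_POINTER       = 0x4,
  QUAL_CONST_POINTER = QUAL_CONST | QUAL_POINTER
};

/* "uint32x4_t vaddw_high_u16 (uint32x4_t, uint16x8_t)" would be
   described by an all-unsigned signature (return value first),
   matching the "_uuu" suffix in the cover letter.  */
static const enum type_qualifiers uaddw_sig[3] =
  { QUAL_UNSIGNED, QUAL_UNSIGNED, QUAL_UNSIGNED };

/* Query one operand's signedness from a signature array.  */
int operand_is_unsigned (const enum type_qualifiers *sig, int opno)
{
  return (sig[opno] & QUAL_UNSIGNED) != 0;
}
```

Because composite qualifiers are just OR-ed flags (e.g. const pointer = const | pointer), the initialisation code can test each property independently when it builds the builtin's function type.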
Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p
On 18/11/13 08:37, Yvan Roux wrote:
> So, the validation is ok with this patch, I'm just not able to say if
> the original performance issue is still fixed with it. Could you check
> it Kyrylo ?

Hi Yvan,

I'll run the benchmark today to confirm the performance, but from
compiling some code sequences that exhibited the bad behaviour in the
past, I see that this patch still fixes the issues. store_minmaxsi is
not generated when optimising for speed.

Thanks,
Kyrill

> Yvan
>
> 2013-11-17  Yvan Roux
>
>	* config/arm/arm.md (store_minmaxsi): Use only when
>	optimize_function_for_size_p.
>
> On 15 November 2013 15:59, Yvan Roux wrote:
>>> Sometimes 4 will be needed, since both original register values may
>>> remain live.
>>
>> Indeed.
>>
>>> However, I'm inclined to agree that while it should be possible to
>>> decide at the *function* level whether or not an insn is valid, doing so
>>> at the block level is probably unsafe.
>>
>> Ok, so the attached patch should fix the issue, its validation is ongoing.
Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p
> Hi Yvan,
> I'll run the benchmark today to confirm the performance, but from compiling
> some code sequences that exhibited the bad behaviour in the past, I see that
> this patch still fixes the issues. store_minmaxsi is not generated when
> optimising for speed.

Ok, cool. Thanks Kyrill.

Cheers,
Yvan
RE: [PATCH, libgcc] Disable JCR section when java is not enabled
Ping, as wasting 8 bytes of RAM isn't ignorable on embedded systems. OK
for trunk stage 1?

> -----Original Message-----
> From: Tom Tromey [mailto:tro...@redhat.com]
> Sent: Thursday, October 10, 2013 21:32
> To: Jakub Jelinek
> Cc: Joey Ye; p...@bothner.com; a...@redhat.com; H.J. Lu; gcc-patches;
> 'Ian Lance Taylor'
> Subject: Re: [PATCH, libgcc] Disable JCR section when java is not enabled
>
> Jakub> Given the state of gcj that it is now only rarely used and most
> Jakub> people just use OpenJDK instead, wouldn't it be a good idea to
> Jakub> just require that gcj code is linked using gcj driver or, if
> Jakub> linked in any other driver, just using a special non-default
> Jakub> option (-flink-jcr or similar), that would be automatically set
> Jakub> by gcj driver, move this JCR stuff out of the normal crt* files
> Jakub> and put it into crtjava*.o instead, and only link in if
> Jakub> -flink-jcr is passed or gcj driver used? Or treat -lgcj as that
> Jakub> magic switch?
>
> The irony of the situation is that this would require significantly more
> work than has gone into gcj in the past N years.
>
> Jakub> Also, looking at crtstuff.c makes me wonder where are classes
> Jakub> deregistered, there are only calls to _Jv_RegisterClasses, but
> Jakub> never to deregistration, wonder what happens if you dlclose a
> Jakub> shared library with registered classes.
>
> I think we never implemented class GC for compiled classes, though it's
> hard to remember.
>
> Tom

jcr_disable_non_java-130910.patch
Description: Binary data
Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support
Ping

2013/11/8 Ilya Enkovich:
> Hi,
>
> Here is an updated patch version with no langhook.
>
> Regarding the TLS objects issue - I do not think the compiler should
> compensate for the absence of instrumentation in libraries. The
> compiler should be responsible for initialization of Bounds Tables for
> the .tdata section. Correct data copy is a responsibility of the
> library. The user should use either an instrumented library or wrapper
> calls if he needs this functionality.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2013-11-06  Ilya Enkovich
>
>	* c/c-parser.c: Include tree-chkp.h.
>	(c_parser_declaration_or_fndef): Register statically
>	initialized decls in Pointer Bounds Checker.
>	* cp/decl.c: Include tree-chkp.h.
>	(cp_finish_decl): Register statically
>	initialized decls in Pointer Bounds Checker.
>	* gimplify.c: Include tree-chkp.h.
>	(gimplify_init_constructor): Register statically
>	initialized decls in Pointer Bounds Checker.
>
>
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index 9ccae3b..397b323 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cgraph.h"
>  #include "plugin.h"
>  #include "omp-low.h"
> +#include "tree-chkp.h"
>
>  /* Initialization routine for this file.  */
> @@ -1682,6 +1683,12 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
>		  maybe_warn_string_init (TREE_TYPE (d), init);
>		  finish_decl (d, init_loc, init.value,
>			       init.original_type, asm_name);
> +
> +		  /* Register all decls with initializers in Pointer
> +		     Bounds Checker to generate required static bounds
> +		     initializers.  */
> +		  if (DECL_INITIAL (d) != error_mark_node)
> +		    chkp_register_var_initializer (d);
>		}
>	    }
>	  else
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 1e92f2a..74df02f 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "splay-tree.h"
>  #include "plugin.h"
>  #include "cgraph.h"
> +#include "tree-chkp.h"
>
>  /* Possible cases of bad specifiers type used by bad_specifiers. */
>  enum bad_spec_place {
> @@ -6379,6 +6380,12 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
>	     the class specifier.  */
>	  if (!DECL_EXTERNAL (decl))
>	    var_definition_p = true;
> +
> +	  /* If var has initializer then we need to register it in
> +	     Pointer Bounds Checker to generate static bounds initializer
> +	     if required.  */
> +	  if (DECL_INITIAL (decl) && DECL_INITIAL (decl) != error_mark_node)
> +	    chkp_register_var_initializer (decl);
>	}
>       /* If the variable has an array type, lay out the type, even if
>	  there is no initializer.  It is valid to index through the
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 4f52c27..7aaac15 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "vec.h"
>  #include "omp-low.h"
>  #include "gimple-low.h"
> +#include "tree-chkp.h"
>
>  #include "langhooks-def.h"	/* FIXME: for lhd_set_decl_assembler_name */
>  #include "tree-pass.h"	/* FIXME: only for PROP_gimple_any */
> @@ -4111,6 +4112,11 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>
>	  walk_tree (&ctor, force_labels_r, NULL, NULL);
>	  ctor = tree_output_constant_def (ctor);
> +
> +	  /* We need to register created constant object to
> +	     initialize bounds for pointers in it.  */
> +	  chkp_register_var_initializer (ctor);
> +
>	  if (!useless_type_conversion_p (type, TREE_TYPE (ctor)))
>	    ctor = build1 (VIEW_CONVERT_EXPR, type, ctor);
>	  TREE_OPERAND (*expr_p, 1) = ctor;
Re: [PATCH, MPX, 2/X] Pointers Checker [9/25] Bound constants
Ping

2013/11/7 Ilya Enkovich:
> 2013/11/7 Jeff Law:
>> On 10/31/13 03:15, Ilya Enkovich wrote:
>>>
>>> Hi,
>>>
>>> Here is a patch which adds support for bound constants to be used as
>>> DECL_INITIAL for constant static bounds generated by the compiler.
>>>
>>> Thanks,
>>> Ilya
>>> --
>>>
>>> gcc/
>>>
>>> 2013-10-23  Ilya Enkovich
>>>
>>>	* emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS.
>>>	* explow.c (trunc_int_for_mode): Likewise.
>>>	* varpool.c (ctor_for_folding): Do not fold constant
>>>	bounds vars.
>>
>> I'm having a bit of trouble reconciling "add support for bound constant
>> to be used as DECL_INITIAL" rationale text and the actual patch.
>>
>> From reading the patch it appears that you want to allow generation of
>> immediate constants for objects with MODE_POINTER_BOUNDS. OK, I can see
>> how that is useful.
>>
>> I can kindof see how you want to error out if someone asks for a
>> constant to be truncated to MODE_POINTER_BOUNDS. Did this trip in
>> practice or is it preemptive?
>
> As far as I remember, the change in trunc_int_for_mode was required to
> expand bound constants on a 32-bit target. The size of the constant is
> equal to the size of HOST_WIDE_INT and thus constant generation goes
> through gen_int_mode and trunc_int_for_mode.
>
>> diff --git a/gcc/varpool.c b/gcc/varpool.c
>> index 2eb1fc1..d9c08c1 100644
>> --- a/gcc/varpool.c
>> +++ b/gcc/varpool.c
>> @@ -254,6 +254,12 @@ ctor_for_folding (tree decl)
>>        && TREE_CODE (decl) != CONST_DECL)
>>      return error_mark_node;
>>
>> +  /* Static constant bounds are created to be
>> +     used instead of constants and therefore
>> +     do not let folding it.  */
>> +  if (POINTER_BOUNDS_P (decl))
>> +    return error_mark_node;
>>
>> Here's the part I'm struggling a bit with.  Why did you need this?
>>
>> Isn't this going to prevent that DECL from being used in folding? The
>> bounds shouldn't really affect that AFAICT.
>
> Bounds constants were introduced only for initialization of constant
> bound vars. Such vars are used to hold commonly used zero bounds (for
> cases when bounds are unknown) and null bounds (for null pointers).
> Usage of such vars is optional and is controlled via a compiler flag.
> It is used to try to decrease the overhead of bounds creation. E.g. for
> MPX we need two instructions to create zero bounds, and it also
> requires one GPR. One of these instructions does not become a nop when
> MPX is off, which additionally increases overhead. Having a constant
> var we can just load bounds using one MPX instruction. And if I do not
> prevent folding for these vars then all constant bounds var usages are
> replaced with immediate bounds constant usages and I do not get the
> desired effect. Since there are no instructions working with bounds
> immediates, I do not see reasons for folding.
>
> Thanks,
> Ilya
>
>> jeff
Re: Some wide-int review comments
Thanks for the changes.

> @@ -8162,7 +8162,7 @@ fold_builtin_logarithm (location_t loc,
>	    /* Prepare to do logN(exp10(exponent) -> exponent*logN(10).  */
>	    {
>	      REAL_VALUE_TYPE dconst10;
> -	      real_from_integer (&dconst10, VOIDmode, 10, SIGNED);
> +	      real_from_integer (&dconst10, VOIDmode, wi::shwi (10, 32), SIGNED);
>	      x = build_real (type, dconst10);
>	    }
>	  exponent = CALL_EXPR_ARG (arg, 0);
> @@ -8315,7 +8315,7 @@ fold_builtin_pow (location_t loc, tree f
>
>    /* Check for an integer exponent.  */
>    n = real_to_integer (&c);
> -  real_from_integer (&cint, VOIDmode, n, SIGNED);
> +  real_from_integer (&cint, VOIDmode, wi::shwi (n, HOST_BITS_PER_WIDE_INT), SIGNED);
>    if (real_identical (&c, &cint))
>      {
>	 /* Attempt to evaluate pow at compile-time, unless this should

Are these changes necessary? The original calls ought to work as-is,
since the function takes a const wide_int_ref &. Same for the rest of
the patch.

> Index: gcc/c/c-parser.c
> ===================================================================
> --- gcc/c/c-parser.c	(revision 204918)
> +++ gcc/c/c-parser.c	(working copy)
> @@ -13375,7 +13375,7 @@ c_parser_cilk_clause_vectorlength (c_par
>	   || !TREE_CONSTANT (expr)
>	   || !INTEGRAL_TYPE_P (TREE_TYPE (expr)))
>	 error_at (loc, "vectorlength must be an integer constant");
> -      else if (exact_log2 (tree_to_hwi (expr)) == -1)
> +      else if (wi::eq_p (wi::exact_log2 (expr), -1))
>	 error_at (loc, "vectorlength must be a power of 2");
>        else
>	 {

FWIW: wi::exact_log2 (expr) == -1 should still work.

> Index: gcc/dwarf2out.c
> ===================================================================
> --- gcc/dwarf2out.c	(revision 204918)
> +++ gcc/dwarf2out.c	(working copy)
> @@ -13428,8 +13428,6 @@ loc_descriptor (rtx rtl, enum machine_mo
>
>    if (mode != VOIDmode && (dwarf_version >= 4 || !dwarf_strict))
>      {
> -	gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
> -
>	 /* Note that a CONST_DOUBLE rtx could represent either an integer
>	    or a floating-point constant.  A CONST_DOUBLE is used whenever
>	    the constant requires more than one word in order to be

The copy of the CONST_DOUBLE comment is still there though. (This is
handling CONST_WIDE_INT rather than CONST_DOUBLE.)

> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c	(revision 204918)
> +++ gcc/tree.c	(working copy)
> @@ -8536,8 +8536,18 @@ retry:
>      return true;
>
>    /* Third, unsigned integers with top bit set never fit signed types.  */
> -  if (!TYPE_UNSIGNED (type) && sgn_c == UNSIGNED && wi::neg_p (c))
> -    return false;
> +  if (!TYPE_UNSIGNED (type) && sgn_c == UNSIGNED)
> +    {
> +      int uprec = GET_MODE_PRECISION (TYPE_MODE TREE_TYPE (c));
> +      if (uprec < TYPE_PRECISION (TREE_TYPE (c)))
> +	{
> +	  wide_int x = wi::sext (c, uprec);
> +	  if (wi::neg_p (x) || wi::ne_p (x, c))
> +	    return false;
> +	}
> +      else if (wi::neg_p (c))
> +	return false;
> +    }
>
>    /* If we haven't been able to decide at this point, there nothing more we
>       can check ourselves here. Look at the base type if we have one and it

I don't really understand this change, but I suppose it's part of the
trunk patch.

Looks good to me otherwise FWIW.

Thanks,
Richard
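For readers unfamiliar with the helper being discussed: both `exact_log2` and `wi::exact_log2` return the base-2 logarithm when the argument is an exact power of two and -1 otherwise, which is why `== -1` works as a "not a power of 2" test in the hunk above. A standalone C sketch of the same contract (my illustration, not the GCC implementation):

```c
/* Return log2(x) if x is an exact power of two, else -1.  */
int exact_log2_sketch (unsigned long long x)
{
  if (x == 0 || (x & (x - 1)) != 0)  /* zero, or more than one bit set */
    return -1;
  int log = 0;
  while ((x >>= 1) != 0)             /* count trailing shifts to reach 1 */
    log++;
  return log;
}
```

The `x & (x - 1)` trick clears the lowest set bit, so the result is zero exactly when at most one bit was set.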
[PATCH, MPX, 2/X] Pointers Checker [11/25] Expand builtins
Hi,

Here is a patch to adapt the expansion of some builtins to the Pointer
Bounds Checker. The patch mostly disables inlining of instrumented
calls to string functions. It also adds support for _NOBND_NOCHK
versions of string functions (this version does not check and copy
bounds, and therefore may be inlined as a regular string function).

Thanks,
Ilya
--
2013-11-13  Ilya Enkovich

	* builtins.c: Include rtl-chkp.h, tree-chkp.h.
	(expand_builtin_mempcpy_args): Add orig exp as argument.
	Support BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK.
	(expand_builtin_mempcpy): Adjust expand_builtin_mempcpy_args call.
	(expand_builtin_stpcpy): Likewise.
	(expand_builtin_memset_args): Support BUILT_IN_CHKP_MEMSET_NOBND_NOCHK.
	(std_expand_builtin_va_start): Initialize bounds for va_list.
	(expand_builtin): Support instrumented calls.
	* optabs.c: Include rtl-chkp.h.
	(expand_unop): Handle bounds in libcall return value.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 7a04664..b46c364 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -50,6 +50,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "ubsan.h"
 #include "cilk.h"
+#include "tree-chkp.h"
+#include "rtl-chkp.h"

 static tree do_mpc_arg1 (tree, tree, int (*)(mpc_ptr, mpc_srcptr, mpc_rnd_t));
@@ -115,7 +117,7 @@ static rtx builtin_memcpy_read_str (void *, HOST_WIDE_INT, enum machine_mode);
 static rtx expand_builtin_memcpy (tree, rtx);
 static rtx expand_builtin_mempcpy (tree, rtx, enum machine_mode);
 static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx,
-					enum machine_mode, int);
+					enum machine_mode, int, tree);
 static rtx expand_builtin_strcpy (tree, rtx);
 static rtx expand_builtin_strcpy_args (tree, tree, rtx);
 static rtx expand_builtin_stpcpy (tree, rtx, enum machine_mode);
@@ -3195,7 +3197,8 @@ expand_builtin_mempcpy (tree exp, rtx target, enum machine_mode mode)
       tree src = CALL_EXPR_ARG (exp, 1);
       tree len = CALL_EXPR_ARG (exp, 2);
       return expand_builtin_mempcpy_args (dest, src, len,
-					  target, mode, /*endp=*/ 1);
+					  target, mode, /*endp=*/ 1,
+					  exp);
     }
 }

@@ -3207,10 +3210,23 @@ expand_builtin_mempcpy (tree exp, rtx target, enum machine_mode mode)

 static rtx
 expand_builtin_mempcpy_args (tree dest, tree src, tree len,
-			     rtx target, enum machine_mode mode, int endp)
+			     rtx target, enum machine_mode mode, int endp,
+			     tree orig_exp)
 {
+  tree fndecl = get_callee_fndecl (orig_exp);
+
   /* If return value is ignored, transform mempcpy into memcpy.  */
-  if (target == const0_rtx && builtin_decl_implicit_p (BUILT_IN_MEMCPY))
+  if (target == const0_rtx
+      && DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK
+      && builtin_decl_implicit_p (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK))
+    {
+      tree fn = builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK);
+      tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3,
+					   dest, src, len);
+      return expand_expr (result, target, mode, EXPAND_NORMAL);
+    }
+  else if (target == const0_rtx
+	   && builtin_decl_implicit_p (BUILT_IN_MEMCPY))
     {
       tree fn = builtin_decl_implicit (BUILT_IN_MEMCPY);
       tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3,
@@ -3394,7 +3410,8 @@ expand_builtin_stpcpy (tree exp, rtx target, enum machine_mode mode)

       lenp1 = size_binop_loc (loc, PLUS_EXPR, len, ssize_int (1));
       ret = expand_builtin_mempcpy_args (dst, src, lenp1,
-					 target, mode, /*endp=*/2);
+					 target, mode, /*endp=*/2,
+					 exp);

       if (ret)
	 return ret;
@@ -3679,7 +3696,8 @@ expand_builtin_memset_args (tree dest, tree val, tree len,
  do_libcall:
   fndecl = get_callee_fndecl (orig_exp);
   fcode = DECL_FUNCTION_CODE (fndecl);
-  if (fcode == BUILT_IN_MEMSET)
+  if (fcode == BUILT_IN_MEMSET
+      || fcode == BUILT_IN_CHKP_MEMSET_NOBND_NOCHK)
     fn = build_call_nofold_loc (EXPR_LOCATION (orig_exp), fndecl, 3,
				 dest, val, len);
   else if (fcode == BUILT_IN_BZERO)
@@ -4231,6 +4249,13 @@ std_expand_builtin_va_start (tree valist, rtx nextarg)
 {
   rtx va_r = expand_expr (valist, NULL_RTX, VOIDmode, EXPAND_WRITE);
   convert_move (va_r, nextarg, 0);
+
+  /* We do not have any valid bounds for the pointer, so
+     just store zero bounds for it.  */
+  if (chkp_function_instrumented_p (current_function_decl))
+    chkp_expand_bounds_reset_for_mem (valist,
+				      make_tree (TREE_TYPE (valist),
+
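As background for the `target == const0_rtx` branch: `mempcpy` differs from `memcpy` only in returning `dest + n` rather than `dest`, so a call whose return value is ignored can be lowered to plain `memcpy` — the same reasoning the patch extends to the `_NOBND_NOCHK` variant. A minimal sketch of the relationship (my illustration, not the GCC code):

```c
#include <string.h>

/* Behaves like mempcpy: copies n bytes and returns the end of the
   destination region instead of its start.  When the caller discards
   the return value, this is observationally identical to memcpy.  */
void *my_mempcpy (void *dest, const void *src, size_t n)
{
  memcpy (dest, src, n);
  return (char *) dest + n;
}
```

The extra return-value computation is free to drop once nothing uses it, which is exactly what the expander's transformation exploits.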
[Ada] Internal error on renaming of constant array with -gnatc
This fixes an ICE on an obscure case with the -gnatc switch.

Tested on x86_64-suse-linux, applied on the mainline.

2013-11-18  Eric Botcazou

	* gcc-interface/decl.c (gnat_to_gnu_entity): Deal with an error
	mark as renamed object in type annotating mode.

--
Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c	(revision 204913)
+++ gcc-interface/decl.c	(working copy)
@@ -1117,8 +1117,12 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
		 as we have a VAR_DECL for the pointer we make.  */
	    }

-	  gnu_expr = build_unary_op (ADDR_EXPR, gnu_type,
-				     maybe_stable_expr);
+	  if (type_annotate_only
+	      && TREE_CODE (maybe_stable_expr) == ERROR_MARK)
+	    gnu_expr = NULL_TREE;
+	  else
+	    gnu_expr = build_unary_op (ADDR_EXPR, gnu_type,
+				       maybe_stable_expr);

	  gnu_size = NULL_TREE;
	  used_by_ref = true;
[PATCH, MPX, 2/X] Pointers Checker [12/25] Expand bounded params
Hi,

Here is a patch to expand bounded arguments for calls, input bounds for bounded params and returned bounds.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich

	* calls.c: Include tree-chkp.h, rtl-chkp.h.
	(arg_data): Add fields for bounds information and to store
	value pushed to the stack.
	(emit_call_1): Propagate instrumentation flag for CALL.
	(precompute_register_parameters): Expand bounds of the arg.
	(initialize_argument_information): Fill bounds information.
	(load_register_parameters): Load passed bounds.
	(expand_call): Handle passed and returned bounds.
	(emit_library_call_value_1): Filter out returned bounds.
	(store_one_arg): Store bounds of arg.
	* cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
	(expand_call_stmt): Propagate instrumentation flag for CALL_EXPR.
	(expand_return): Handle returned bounds.
	(expand_gimple_stmt_1): Adjust to new expand_return signature.
	(gimple_expand_cfg): Reset rtx bounds map.
	* explow.c: Include rtl-chkp.h.
	(hard_function_value): Handle returned bounds.
	* expr.h (store_expr): Add param for bounds target.
	* expr.c: Include tree-chkp.h, rtl-chkp.h.
	(expand_assignment): Handle returned bounds.
	(store_expr): Likewise.
	(store_constructor): Adjust to new store_expr signature.
	(store_field): Likewise.
	(expand_expr_real_2): Likewise.
	(expand_expr_real_1): Likewise.
	* tree-outof-ssa.c (insert_value_copy_on_edge): Adjust to new
	store_expr signature.
	* function.c: Include tree-chkp.h, rtl-chkp.h.
	(aggregate_value_p): Handle returned bounds.
	(use_register_for_decl): Do not registerize decls used for bounds
	stores and loads.
	(assign_parm_data_one): Add field for bounds.
	(assign_parm_find_entry_rtl): Fill bounds info.
	(assign_parms): Initialize input bounds for args.
	(expand_function_start): Handle returned bounds.
	(diddle_return_value_1): New.
	(diddle_return_value): Handle returned bounds.
	(expand_function_end): Likewise.
	* function.h (rtl_data): Add field for returned bounds.
diff --git a/gcc/calls.c b/gcc/calls.c index 4dcdb27..7ebf310 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -40,6 +40,8 @@ along with GCC; see the file COPYING3. If not see #include "cgraph.h" #include "except.h" #include "dbgcnt.h" +#include "tree-chkp.h" +#include "rtl-chkp.h" /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits. */ #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT) @@ -50,12 +52,20 @@ struct arg_data { /* Tree node for this argument. */ tree tree_value; + /* Bounds tree node for this argument. */ + tree bounds_value; + /* Bounds RTL for this argument. */ + rtx bounds; + /* Slot to be used to pass bounds. */ + rtx bounds_slot; /* Mode for value; TYPE_MODE unless promoted. */ enum machine_mode mode; /* Current RTL value for argument, or 0 if it isn't precomputed. */ rtx value; /* Initially-compute RTL value for argument; only for const functions. */ rtx initial_value; + /* Pushed value. */ + rtx pushed_value; /* Register to pass this argument in, 0 if passed on stack, or an PARALLEL if the arg is to be copied into multiple non-contiguous registers. */ @@ -387,6 +397,10 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, tree fndecl ATTRIBUTE_UNU && MEM_EXPR (funmem) != NULL_TREE) set_mem_expr (XEXP (call, 0), MEM_EXPR (funmem)); + /* Mark instrumented calls. */ + if (call && fntree) +CALL_EXPR_WITH_BOUNDS_P (call) = CALL_WITH_BOUNDS_P (fntree); + /* Put the register usage information there. */ add_function_usage_to (call_insn, call_fusage); @@ -865,6 +879,59 @@ precompute_register_parameters (int num_actuals, struct arg_data *args, && targetm.small_register_classes_for_mode_p (args[i].mode)) || optimize)) args[i].value = copy_to_mode_reg (args[i].mode, args[i].value); + + /* Expand argument bounds if any. */ + if (args[i].bounds_value) + { + /* If bounds_value is a list then we pass bounds + for a structure. 
TREE_VALUE of each node holds + base structure address and TREE_PURPOSE holds + corresponding offset of the pointer field in + structure. */ + if (TREE_CODE (args[i].bounds_value) == TREE_LIST) + { + tree node = args[i].bounds_value; + unsigned bnd_num = list_length (args[i].bounds_value); + rtx *bounds = XALLOCAVEC (rtx, bnd_num); + unsigned bnd_no = 0; + tree base_addr = TREE_VALUE (node); + rtx base = expand_normal (base_addr); + + /* Expand all nodes in the list. */ + while (node) +
[Ada] Fix unexpected read of volatile scalar for Out parameter
This fixes an old issue in the compiler, whereby it unexpectedly generates a read of a volatile variable with scalar type passed as Out parameter to a subprogram; the subtlety being that the side-effects of the parameter viewed as a name still need to be evaluated prior to the call. Tested on x86_64-suse-linux, applied on the mainline.

2013-11-18  Eric Botcazou

	* gcc-interface/trans.c (Call_to_gnu): For an Out parameter passed by
	copy and that don't need to be copied in, only evaluate its address.

2013-11-18  Eric Botcazou

	* gnat.dg/volatile11.adb: New test.
	* gnat.dg/volatile11_pkg.ad[sb]: New helper.

-- Eric Botcazou

Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 204913)
+++ gcc-interface/trans.c	(working copy)
@@ -4130,9 +4130,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	gnu_name = convert (TREE_TYPE (TYPE_FIELDS (TREE_TYPE (gnu_name))),
 			    gnu_name);
 
-      /* If we have not saved a GCC object for the formal, it means it is an
-	 Out parameter not passed by reference and that need not be copied in.
-	 Otherwise, first see if the parameter is passed by reference.  */
+      /* First see if the parameter is passed by reference.  */
       if (is_true_formal_parm && DECL_BY_REF_P (gnu_formal))
 	{
 	  if (Ekind (gnat_formal) != E_In_Parameter)
@@ -4178,6 +4176,9 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	  gnu_formal_type = TREE_TYPE (gnu_formal);
 	  gnu_actual = build_unary_op (ADDR_EXPR, gnu_formal_type, gnu_actual);
 	}
+
+      /* Then see if the parameter is an array passed to a foreign convention
+	 subprogram.  */
       else if (is_true_formal_parm && DECL_BY_COMPONENT_PTR_P (gnu_formal))
 	{
 	  gnu_formal_type = TREE_TYPE (gnu_formal);
@@ -4198,6 +4199,8 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	     but this is the most likely to work in all cases.  */
 	  gnu_actual = build_unary_op (ADDR_EXPR, gnu_formal_type, gnu_actual);
 	}
+
+      /* Then see if the parameter is passed by descriptor.  */
       else if (is_true_formal_parm && DECL_BY_DESCRIPTOR_P (gnu_formal))
 	{
 	  gnu_actual = convert (gnu_formal_type, gnu_actual);
@@ -4214,6 +4217,8 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	     (TREE_TYPE (TREE_TYPE (gnu_formal)), gnu_actual,
 	      gnat_actual));
 	}
+
+      /* Otherwise the parameter is passed by copy.  */
       else
 	{
 	  tree gnu_size;
@@ -4221,11 +4226,18 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	  if (Ekind (gnat_formal) != E_In_Parameter)
 	    gnu_name_list = tree_cons (NULL_TREE, gnu_name, gnu_name_list);
 
+	  /* If we didn't create a PARM_DECL for the formal, this means that
+	     it is an Out parameter not passed by reference and that need not
+	     be copied in.  In this case, the value of the actual need not be
+	     read.  However, we still need to make sure that its side-effects
+	     are evaluated before the call, so we evaluate its address.  */
 	  if (!is_true_formal_parm)
 	    {
-	      /* Make sure side-effects are evaluated before the call.  */
 	      if (TREE_SIDE_EFFECTS (gnu_name))
-		append_to_statement_list (gnu_name, &gnu_stmt_list);
+		{
+		  tree addr = build_unary_op (ADDR_EXPR, NULL_TREE, gnu_name);
+		  append_to_statement_list (addr, &gnu_stmt_list);
+		}
 	      continue;
 	    }

-- { dg-do run }
-- { dg-options "-O -gnatp" }

with Volatile11_Pkg; use Volatile11_Pkg;

procedure Volatile11 is
   Value : Integer := 1;
   Bit1 : Boolean := false; pragma Volatile (Bit1);
   Bit2 : Boolean := false; pragma Volatile (Bit2);
   Bit3 : Boolean := false; pragma Volatile (Bit3);
   Bit4 : Boolean := false; pragma Volatile (Bit4);
   Bit5 : Boolean := false; pragma Volatile (Bit5);
   Bit6 : Boolean := false; pragma Volatile (Bit6);
   Bit7 : Boolean := false; pragma Volatile (Bit7);
   Bit8 : Boolean := false; pragma Volatile (Bit8);
begin
   Bit_Test(Input => Value,
            Output1 => Bit1,
            Output2 => Bit2,
            Output3 => Bit3,
            Output4 => Bit4,
            Output5 => Bit5,
            Output6 => Bit6,
            Output7 => Bit7,
            Output8 => F.all);
   -- Check that F is invoked before Bit_Test
   if B /= True then
      raise Program_Error;
   end if;
end;

package body Volatile11_Pkg is

   procedure Bit_Test(Input : in Integer;
                      Output1 : out Boolean;
                      Output2 : out Boolean;
                      Output3 : out Boolean;
                      Output4 : out Boolean;
                      Output5 : out Boolean;
                      Output6 : out Boolean;
                      Output7 : out Boolean;
                      Output8 : out Boolean) is
   begin
      Output8 := B;
      Output7 := Input = 7;
      Output6 := Input = 6;
      Output5 := Input = 5;
      Output4 := Input = 4;
      Output3 := Input = 3;
      Output2 := Input = 2;
      Output1 := Input = 1;
   end Bit_Test;

func
Clean up LTO generation
I'm looking into using LTO to pass information between compilers for different targets, for the OpenACC implementation. This area of the compiler seems somewhat obfuscated by overengineering, and I'd like to simplify it first to make it easier to work with.

lto_gimple_out and lto_finish_out aren't real passes; only their write_summary methods are nonnull. The only thing we really do to emit LTO is make two function calls, to lto_output and produce_asm_for_decls, but this is wrapped in a lot of pass_manager boilerplate that confuses the reader. Other downsides are bogus empty dump files left behind, and plain dead code like

  ipa_read_optimization_summaries_1 (passes->all_lto_gen_passes);

The following patch simplifies this by getting rid of all_lto_gen_passes. We could simplify a little further if we decided we really don't need two different timevars for the different steps of LTO output.

Bootstrapped and tested on x86_64-linux, ok?

Bernd

	* cgraphunit.c (ipa_passes): Don't execute all_lto_gen_passes.
	* lto-streamer-out.c (lto_output, produce_asm_for_decls): No longer
	static.
	(pass_data_ipa_lto_gimple_out, pass_ipa_lto_gimple_out,
	make_pass_ipa_lto_gimple_out, pass_data_ipa_lto_finish_out,
	pass_ipa_lto_finish_out, make_pass_ipa_lto_finish_out): Remove.
	* lto-streamer.h (lto_output, produce_asm_for_decls): Declare.
	* pass-manager.h (GCC_PASS_LISTS, class pass_manager): Remove
	all_lto_gen_passes.
	* passes.c (pass_manager::dump_passes): Remove its use.
	(pass_manager::register_pass): Likewise.
	(ipa_read_summaries, ipa_read_optimization_summaries): Likewise.
	(pass_manager::pass_manager): Don't initialize or use it.
	(write_lto): New static function.
	(ipa_write_summaries_1, ipa_write_optimization_summaries): Use it
	instead of using all_lto_gen_passes.
	* passes.def (all_lto_gen_passes, pass_ipa_lto_gimple_out,
	pass_ipa_lto_finish_out): Delete.
	* tree-pass.h (make_pass_ipa_lto_gimple_out,
	make_pass_ipa_lto_finish_out): Don't declare.
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 8ab274b..e815be0 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -2018,9 +2018,6 @@ ipa_passes (void) if (flag_generate_lto) targetm.asm_out.lto_start (); - execute_ipa_summary_passes ((struct ipa_opt_pass_d *) - passes->all_lto_gen_passes); - if (!in_lto_p) ipa_write_summaries (); diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c index 9a00230..5176e35 100644 --- a/gcc/lto-streamer-out.c +++ b/gcc/lto-streamer-out.c @@ -1966,7 +1966,7 @@ copy_function (struct cgraph_node *node) /* Main entry point from the pass manager. */ -static void +void lto_output (void) { struct lto_out_decl_state *decl_state; @@ -2016,53 +2016,6 @@ lto_output (void) #endif } -namespace { - -const pass_data pass_data_ipa_lto_gimple_out = -{ - IPA_PASS, /* type */ - "lto_gimple_out", /* name */ - OPTGROUP_NONE, /* optinfo_flags */ - true, /* has_gate */ - false, /* has_execute */ - TV_IPA_LTO_GIMPLE_OUT, /* tv_id */ - 0, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - 0, /* todo_flags_finish */ -}; - -class pass_ipa_lto_gimple_out : public ipa_opt_pass_d -{ -public: - pass_ipa_lto_gimple_out (gcc::context *ctxt) -: ipa_opt_pass_d (pass_data_ipa_lto_gimple_out, ctxt, - NULL, /* generate_summary */ - lto_output, /* write_summary */ - NULL, /* read_summary */ - lto_output, /* write_optimization_summary */ - NULL, /* read_optimization_summary */ - NULL, /* stmt_fixup */ - 0, /* function_transform_todo_flags_start */ - NULL, /* function_transform */ - NULL) /* variable_transform */ - {} - - /* opt_pass methods: */ - bool gate () { return gate_lto_out (); } - -}; // class pass_ipa_lto_gimple_out - -} // anon namespace - -ipa_opt_pass_d * -make_pass_ipa_lto_gimple_out (gcc::context *ctxt) -{ - return new pass_ipa_lto_gimple_out (ctxt); -} - - /* Write each node in encoded by ENCODER to OB, as well as those reachable from it and required for correct 
representation of its semantics. Each node in ENCODER must be a global declaration or a type. A node @@ -2382,7 +2335,7 @@ produce_symtab (struct output_block *ob) this file to be written in to a section that can then be read in to recover these on other side. */ -static void +void produce_asm_for_decls (void) { struct lto_out_decl_state *out_state; @@ -2486,50 +2439,3 @@ produce_asm_for_decls (void) lto_function_decl_states.release (); destroy_output_block (ob); } - - -namespace { - -const pass_data pass_data_ipa_lto_finish_out = -{ - IPA_PASS, /* type */ - "lto_decls_out", /* name */ - OPTGROUP_NONE, /* optinfo_flags */ - true, /* has_gate */ - false, /* has_execute */ - TV_IPA_LTO_DECL_OUT, /* tv_id */ - 0, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - 0, /* todo_fl
Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx patterns in IVOPT
Ping^2 Thanks, bin On Tue, Nov 12, 2013 at 3:08 PM, bin.cheng wrote: > Ping in this one. > Hi Bernd, could you please help us on this one? > Sorry for the inconvenience. > > Thanks, > bin > >> -Original Message- >> From: Bin.Cheng [mailto:amker.ch...@gmail.com] >> Sent: Monday, November 04, 2013 8:56 PM >> To: Richard Biener >> Cc: Bin Cheng; Bernd Schmidt; GCC Patches >> Subject: Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx >> patterns in IVOPT >> >> On Mon, Nov 4, 2013 at 7:38 PM, Richard Biener >> wrote: >> > On Mon, Nov 4, 2013 at 4:31 AM, bin.cheng wrote: >> >> Hi, >> >> >> >> The IVOPT in GCC has a problem that it does not use cost of >> >> auto-increment address expression in accounting, while it retreats to >> >> cost of address expression if auto-increment addressing mode is >> unavailable. >> >> For example, on ARM target: >> >> 1) the cost of "[reg]" (which is 6) is used for address expression >> >> "[reg], #off"; >> >> 2) the cost of "[reg+off]" (which is 2) is used for address >> >> expression "[reg, #off]!"; >> >> >> >> This causes: >> >> 1) cost of non-auto increment address expression is used for >> >> auto-increment address expression; >> >> 2) different address costs are used for pre/post increment address >> >> expressions. >> >> This patch fixes the problem by computing, caching and using the cost >> >> of auto-increment address expressions. >> >> >> >> Bootstrap and test on x86/arm. Is it OK? >> > >> > But don't you need to adjust >> > >> > static bool >> > determine_use_iv_cost_address (struct ivopts_data *data, >> >struct iv_use *use, struct iv_cand >> > *cand) { >> > bitmap depends_on; >> > bool can_autoinc; >> > int inv_expr_id = -1; >> > comp_cost cost = get_computation_cost (data, use, cand, true, >> &depends_on, >> > &can_autoinc, &inv_expr_id); >> > >> > if (cand->ainc_use == use) >> > { >> > if (can_autoinc) >> > cost.cost -= cand->cost_step; >> > >> > this which seems to try to compensate for your issue? 
>> That's where problem gets complicated depending on how backend defines >> address cost. For back ends define cost according to the true cost of >> addressing mode approximately, the address cost of auto-increment >> addressing mode doesn't take the saved stepping instruction into >> consideration, so the code is necessary. >> Moreover, according to gcc internal's description about >> TARGET_ADDRESS_COST, RISC machines may define different address cost >> for addressing modes which actually have equal execution on micro- >> architecture level (like ARM for now). The problems are: >> 1) Though address costs are defined in this "discriminated" way, it's > unlikely >> to have the saved stepping instruction considered either. >> The address cost of auto-increment address expression shouldn't go so far. >> 2) We think the "discriminated" address cost model is established before >> gimple pass and is outdated. The true micro-architecture address cost (or >> cost normalized with COSTS_N_INSNS) should be used in GCC nowadays. >> The rtl passes like fwprop_addr which use address cost as heuristic >> information should be refined... but that's totally another problem (am >> investigating it). >> >> So overall, I think the stepping cost should to be subtracted here. >> >> > >> > Or maybe I don't understand. >> > >> > CCing Bernd who implemented this IIRC. >> Any suggestions appreciated. >> >> Thanks. >> bin >> >> -- >> Best Regards. > > > > -- Best Regards.
Re: [PATCH] Time profiler - phase 2
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 5cb07b7..754f882 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,13 @@
> +2013-11-17  Martin Liska
> +	    Jan Hubicka
> +
> +	* cgraphunit.c (node_cmp): New function.
> +	(expand_all_functions): Function ordering added.
> +	* common.opt: New profile based function reordering flag introduced.
> +	* lto-partition.c: Support for time profile added.
> +	* lto.c: Likewise.
> +	* predict.c (handle_missing_profiles): Time profile handled in
> +	missing profiles.

OK, thanks!

Implementing the function section naming scheme would be easy, and it would enable us to do the reordering even w/o LTO, which would be quite cool. Let's hope it gets resolved soon.

Honza
[Ada] Remove redundant define
Tested on x86_64-suse-linux, applied on the mainline.

2013-11-18  Eric Botcazou

	* gcc-interface/trans.c (TARGET_ABI_OPEN_VMS): Delete as redundant.

-- Eric Botcazou

Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 204943)
+++ gcc-interface/trans.c	(working copy)
@@ -66,12 +66,6 @@
    instead.  */
 #define ALLOCA_THRESHOLD 1000
 
-/* Let code below know whether we are targeting VMS without need of
-   intrusive preprocessor directives.  */
-#ifndef TARGET_ABI_OPEN_VMS
-#define TARGET_ABI_OPEN_VMS 0
-#endif
-
 /* In configurations where blocks have no end_locus attached, just sink
    assignments into a dummy global.  */
 #ifndef BLOCK_SOURCE_END_LOCATION
[PATCH, MPX, 2/X] Pointers Checker [13/25] Inline support
Hi, Here is a patch to add instumented code support in inliner. Main changes are: - Replace BUILT_IN_CHKP_ARG_BND calls in inlined code with actually passed bounds - Replace BUILT_IN_CHKP_BNDRET in caller with returned bounds - Add bounds copy for assignments generated by inliner Thanks, Ilya -- 2013-11-15 Ilya Enkovich * ipa-inline.c (early_inliner): Check edge has summary allocated. * tree-inline.c: Include tree-chkp.h. (declare_return_variable): Add arg holding returned bounds slot. Create and initialize returned bounds var. (remap_gimple_stmt): Handle returned bounds. Replace BUILT_IN_CHKP_ARG_BND built-in call with actual arg bounds. Return sequence of statements instead of a single statement. (insert_init_stmt): Add declaration. (remap_gimple_seq): Adjust to new remap_gimple_stmt signature. (copy_bb): Adjust to changed return type of remap_gimple_stmt. (initialize_inlined_parameters): Initialize bounds of params. (expand_call_inline): Handle returned bounds. Add bounds copy for generated mem to mem assignments. * tree-inline.h (copy_body_data): Add fields retbnd and assign_stmts. * cgraph.c: Include tree-chkp.h. (cgraph_redirect_edge_call_stmt_to_callee): Support returned bounds. * value-prof.c: Include tree-chkp.h. (gimple_ic): Support returned bounds. diff --git a/gcc/cgraph.c b/gcc/cgraph.c index 2841055..549703d 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see #include "ipa-inline.h" #include "cfgloop.h" #include "gimple-pretty-print.h" +#include "tree-chkp.h" /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this. 
*/ #include "tree-pass.h" @@ -1380,6 +1381,28 @@ cgraph_redirect_edge_call_stmt_to_callee (struct cgraph_edge *e) e->speculative = false; cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt, false); + if (gimple_call_with_bounds_p (new_stmt) + && gimple_call_lhs (new_stmt) + && chkp_retbnd_call_by_val (gimple_call_lhs (e2->call_stmt))) + { + tree dresult = gimple_call_lhs (new_stmt); + tree iresult = gimple_call_lhs (e2->call_stmt); + gimple dbndret = chkp_retbnd_call_by_val (dresult); + gimple ibndret = chkp_retbnd_call_by_val (iresult); + struct cgraph_edge *iedge = cgraph_edge (e2->caller, ibndret); + struct cgraph_edge *dedge; + + if (dbndret) + { + dedge = cgraph_create_edge (iedge->caller, iedge->callee, + dbndret, e->count, + e->frequency); + dedge->frequency = compute_call_stmt_bb_frequency + (dedge->caller->decl, gimple_bb (dedge->call_stmt)); + } + iedge->frequency = compute_call_stmt_bb_frequency + (iedge->caller->decl, gimple_bb (iedge->call_stmt)); + } e->frequency = compute_call_stmt_bb_frequency (e->caller->decl, gimple_bb (e->call_stmt)); e2->frequency = compute_call_stmt_bb_frequency diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c index 85f8e5d..ecf7572 100644 --- a/gcc/ipa-inline.c +++ b/gcc/ipa-inline.c @@ -2267,11 +2267,15 @@ early_inliner (void) info that might be cleared out for newly discovered edges. */ for (edge = node->callees; edge; edge = edge->next_callee) { - struct inline_edge_summary *es = inline_edge_summary (edge); - es->call_stmt_size - = estimate_num_insns (edge->call_stmt, &eni_size_weights); - es->call_stmt_time - = estimate_num_insns (edge->call_stmt, &eni_time_weights); + /* We have no summary for new bound store calls yet. 
*/ + if (inline_edge_summary_vec.length () > (unsigned)edge->uid) + { + struct inline_edge_summary *es = inline_edge_summary (edge); + es->call_stmt_size + = estimate_num_insns (edge->call_stmt, &eni_size_weights); + es->call_stmt_time + = estimate_num_insns (edge->call_stmt, &eni_time_weights); + } if (edge->callee->decl && !gimple_check_call_matching_types ( edge->call_stmt, edge->callee->decl, false)) diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 1e1285a..d2f2337 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-pass.h" #include "target.h" #include "cfgloop.h" +#include "tree-chkp.h" #include "rtl.h" /
[Ping, avr] Emit diagnostics if -f{pic,PIC,pie,PIE} or -shared is passed
Ping!

Regards
Senthil

On Mon, Nov 04, 2013 at 06:45:19PM +0530, Senthil Kumar Selvaraj wrote:
> The AVR backend does not generate position independent code, yet it
> happily accepts -fpic, -fPIC, -fpie and -fPIE. The generated code
> doesn't change at all. Also, it accepts the -shared option to generate a
> shared library, without really doing anything with it.
>
> This causes one of the regression tests
> (gcc.dg/lto/pr54709 c_lto_pr54709_0.o-c_lto_pr54709_1.o link) to fail with
> an 'undefined reference to main' error, when the test is trying to build
> a shared object.
>
> The attached patch generates a warning if one of the -f{pic,PIC,pie,PIE}
> options is provided, and an error if -shared is provided
> (config/mep/mep.c and config/s390/tpf.h already do something very similar).
>
> Regression tested with no new failures. Tests which exercise PIC now
> report as unsupported.
>
> If ok, could someone commit please? I don't have commit access.
>
> Regards
> Senthil
>
> gcc/ChangeLog
> 2013-11-04  Senthil Kumar Selvaraj
>
> 	* config/avr/avr.c (avr_option_override): Warn if asked to generate
> 	position independent code.
> 	* config/avr/avr.h: Modify LINK_SPEC to reject -shared.
> > > diff --git gcc/config/avr/avr.c gcc/config/avr/avr.c > index e7e1c2f..cf4b8ca 100644 > --- gcc/config/avr/avr.c > +++ gcc/config/avr/avr.c > @@ -310,6 +310,15 @@ avr_option_override (void) >flag_omit_frame_pointer = 0; > } > > + if (flag_pic == 1) > +warning (OPT_fpic, "-fpic is not supported"); > + if (flag_pic == 2) > +warning (OPT_fPIC, "-fPIC is not supported"); > + if (flag_pie == 1) > +warning (OPT_fpie, "-fpie is not supported"); > + if (flag_pie == 2) > +warning (OPT_fPIE, "-fPIE is not supported"); > + >avr_current_device = &avr_mcu_types[avr_mcu_index]; >avr_current_arch = &avr_arch_types[avr_current_device->arch]; > > diff --git gcc/config/avr/avr.h gcc/config/avr/avr.h > index f223a61..1eff5be 100644 > --- gcc/config/avr/avr.h > +++ gcc/config/avr/avr.h > @@ -522,7 +522,8 @@ extern const char *avr_device_to_sp8 (int argc, const > char **argv); > mmcu=at90can64*|\ > mmcu=at90usb64*:--pmem-wrap-around=64k}}}\ > %:device_to_ld(%{mmcu=*:%*})\ > -%:device_to_data_start(%{mmcu=*:%*})" > +%:device_to_data_start(%{mmcu=*:%*})\ > +%{shared:%eshared is not supported}" > > #define LIB_SPEC \ > > "%{!mmcu=at90s1*:%{!mmcu=attiny11:%{!mmcu=attiny12:%{!mmcu=attiny15:%{!mmcu=attiny28: > -lc }"
[PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting
Hi,

Here is a patch to disable function splitting when bounds transfer would be required for the split-off part.

Thanks,
Ilya
--
2013-11-13  Ilya Enkovich

	* ipa-split.c: Include tree-chkp.h.
	(consider_split): Do not split when splitted part needs
	bounds transfer.

diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index e55b3f5..1c3df61 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -101,6 +101,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "ipa-inline.h"
 #include "cfgloop.h"
+#include "tree-chkp.h"
 
 /* Per basic block info.  */
 
@@ -378,6 +379,7 @@ consider_split (struct split_point *current, bitmap non_ssa_vars,
   int incoming_freq = 0;
   tree retval;
   bool back_edge = false;
+  bitmap_iterator bi;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_split_point (dump_file, current);
@@ -486,6 +488,28 @@ consider_split (struct split_point *current, bitmap non_ssa_vars,
   if (!VOID_TYPE_P (TREE_TYPE (current_function_decl)))
     call_overhead += estimate_move_cost (TREE_TYPE (current_function_decl));
 
+  /* Currently bounds passing and return is not supported for
+     splitted functions.  */
+  EXECUTE_IF_SET_IN_BITMAP (current->ssa_names_to_pass, 0, i, bi)
+    {
+      if (POINTER_BOUNDS_P (ssa_name (i)))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file,
+		     "  Refused: need to pass bounds\n");
+	  return;
+	}
+    }
+
+  if (chkp_function_instrumented_p (current_function_decl)
+      && chkp_type_has_pointer (TREE_TYPE (current_function_decl)))
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file,
+		 "  Refused: need to return bounds\n");
+      return;
+    }
+
   if (current->split_size <= call_overhead)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
[PATCH, MPX, 2/X] Pointers Checker [15/25] IPA Propagation
Hi,

Here is a patch to disable propagation of bounded values.

Thanks,
Ilya
--
2013-11-13  Ilya Enkovich

	* ipa-prop.c: Include tree-chkp.h.
	(ipa_compute_jump_functions_for_edge): Do not propagate
	bounded args.

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index eb464e4..81e1237 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-streamer.h"
 #include "params.h"
 #include "ipa-utils.h"
+#include "tree-chkp.h"
 
 /* Intermediate information about a parameter that is only useful during the
    run of ipa_analyze_node and is not kept afterwards.  */
@@ -1558,6 +1559,7 @@ ipa_compute_jump_functions_for_edge (struct param_analysis_info *parms_ainfo,
   struct ipa_node_params *info = IPA_NODE_REF (cs->caller);
   struct ipa_edge_args *args = IPA_EDGE_REF (cs);
   gimple call = cs->call_stmt;
+  tree fndecl = gimple_call_fndecl (call);
   int n, arg_num = gimple_call_num_args (call);
 
   if (arg_num == 0 || args->jump_functions)
@@ -1575,7 +1577,13 @@ ipa_compute_jump_functions_for_edge (struct param_analysis_info *parms_ainfo,
       tree arg = gimple_call_arg (call, n);
       tree param_type = ipa_get_callee_param_type (cs, n);
 
-      if (is_gimple_ip_invariant (arg))
+      /* No optimization for bounded types yet implemented.  */
+      if ((gimple_call_with_bounds_p (call)
+	   || (fndecl && chkp_function_instrumented_p (fndecl)))
+	  && ((param_type && chkp_type_has_pointer (param_type))
+	      || (!param_type && chkp_type_has_pointer (TREE_TYPE (arg)))))
+	continue;
+      else if (is_gimple_ip_invariant (arg))
 	ipa_set_jf_constant (jfunc, arg, cs);
       else if (!is_gimple_reg_type (TREE_TYPE (arg))
 	       && TREE_CODE (arg) == PARM_DECL)
RE: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM
Hi,

On Fri, 15 Nov 2013 13:30:51, Richard Biener wrote:
>> That looks like always pretending it is a bit field.
>> But it is not a bit field, and bitregion_start=bitregion_end=0
>> means it is an ordinary value.
>
> I don't think it is supposed to mean that. It's supposed to mean
> "the access is unconstrained".

Ok, agreed; I did not regard that as a feature. And apparently only the code path in expand_assignment really has a problem with it.

So here is my second attempt at fixing this. Boot-strapped and regression-tested on x86_64-linux-gnu.

OK for trunk?

Thanks
Bernd.

2013-11-18  Bernd Edlinger

	Fix C++0x memory model for unaligned fields in packed, aligned(4)
	structures with -fno-strict-volatile-bitfields on STRICT_ALIGNMENT
	targets like arm-none-eabi.
	* expr.c (expand_assignment): Handle normal fields like bit regions.

testsuite:
2013-11-18  Bernd Edlinger

	* gcc.dg/pr56997-4.c: New testcase.

patch-unaligned-data.diff
Description: Binary data
[PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion
Hi,

Here is a patch to disable the tail recursion transformation when bounds are passed by the call. The reason is BUILT_IN_CHKP_ARG_BND, which should always get the default SSA_NAME of a PARM_DECL as its argument.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich

	* tree-tailcall.c: Include tree-chkp.h.
	(suitable_for_tail_opt_p): Disable tail recursion for instrumented
	functions with bounded args.

diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index 185bf16..59845950 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "common/common-target.h"
 #include "ipa-utils.h"
+#include "tree-chkp.h"
 
 /* The file implements the tail recursion elimination.  It is also used to
    analyze the tail calls in general, passing the results to the rtl level
@@ -141,6 +142,20 @@ suitable_for_tail_opt_p (void)
   if (cfun->stdarg)
     return false;
 
+  /* Tail recursion elimination may cause arg_bnd builtins to be called
+     not for PARM_DECL which is not allowed now.  Avoid optimization
+     in such cases for now.  */
+  if (chkp_function_instrumented_p (current_function_decl))
+    {
+      tree param;
+
+      for (param = DECL_ARGUMENTS (current_function_decl);
+	   param;
+	   param = DECL_CHAIN (param))
+	if (BOUNDED_P (param))
+	  return false;
+    }
+
   return true;
 }
 
 /* Returns false when the function is not suitable for tail call optimization
[PATCH, MPX, 2/X] Pointers Checker [17/25] IPA SRA
Hi,

Here is a patch to remove bounded params from the candidates for reduction.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich

	* tree-sra.c: Include tree-chkp.h.
	(find_param_candidates): Bounded param is not a candidate.

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index d97af13..eec2b6b 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -102,6 +102,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-chkp.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -3667,7 +3668,9 @@ find_param_candidates (void)
 
       if (TREE_THIS_VOLATILE (parm)
	  || TREE_ADDRESSABLE (parm)
-	  || (!is_gimple_reg_type (type) && is_va_list_type (type)))
+	  || (!is_gimple_reg_type (type) && is_va_list_type (type))
+	  || (chkp_function_instrumented_p (current_function_decl)
+	      && chkp_type_has_pointer (type)))
	continue;
 
       if (is_unused_scalar_param (parm))
Re: [1/10] Add tree_fits_shwi_p and tree_fits_uhwi_p
On Sat, Nov 16, 2013 at 1:55 PM, Richard Sandiford wrote: > Add tree_fits_shwi_p and tree_fits_uhwi_p. The implementations are taken > directly from host_integerp. tree_ is a bit generic - you only ever return true for INTEGER_CSTs, so please use int_fits_[su]hwi_p please (mimicing int_fits_type_p). Ok with that change. Thanks, Richard. > Thanks, > Richard > > > gcc/ > * tree.h (tree_fits_shwi_p, tree_fits_uhwi_p): Declare. > * tree.c (tree_fits_shwi_p, tree_fits_uhwi_p): Define. > > Index: gcc/tree.h > === > --- gcc/tree.h 2013-11-16 09:09:56.388037088 + > +++ gcc/tree.h 2013-11-16 09:11:53.535874667 + > @@ -3659,6 +3659,16 @@ extern int host_integerp (const_tree, in >ATTRIBUTE_PURE /* host_integerp is pure only when checking is disabled. */ > #endif >; > +extern bool tree_fits_shwi_p (const_tree) > +#ifndef ENABLE_TREE_CHECKING > + ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled. > */ > +#endif > + ; > +extern bool tree_fits_uhwi_p (const_tree) > +#ifndef ENABLE_TREE_CHECKING > + ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled. > */ > +#endif > + ; > extern HOST_WIDE_INT tree_low_cst (const_tree, int); > #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003) > extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > Index: gcc/tree.c > === > --- gcc/tree.c 2013-11-16 09:09:56.388037088 + > +++ gcc/tree.c 2013-11-16 09:11:53.534874659 + > @@ -6990,6 +6990,32 @@ host_integerp (const_tree t, int pos) > || (pos && TREE_INT_CST_HIGH (t) == 0))); > } > > +/* Return true if T is an INTEGER_CST whose numerical value (extended > + according to TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT. 
*/ > + > +bool > +tree_fits_shwi_p (const_tree t) > +{ > + return (t != NULL_TREE > + && TREE_CODE (t) == INTEGER_CST > + && ((TREE_INT_CST_HIGH (t) == 0 > + && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0) > + || (TREE_INT_CST_HIGH (t) == -1 > + && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0 > + && !TYPE_UNSIGNED (TREE_TYPE (t))))); > +} > + > +/* Return true if T is an INTEGER_CST whose numerical value (extended > + according to TYPE_UNSIGNED) fits in an unsigned HOST_WIDE_INT. */ > + > +bool > +tree_fits_uhwi_p (const_tree t) > +{ > + return (t != NULL_TREE > + && TREE_CODE (t) == INTEGER_CST > + && TREE_INT_CST_HIGH (t) == 0); > +} > + > /* Return the HOST_WIDE_INT least significant bits of T if it is an > INTEGER_CST and there is no overflow. POS is nonzero if the result must > be non-negative. We must be able to satisfy the above conditions. */
Re: [5/10] Add tree_to_shwi and tree_to_uhwi
On Sat, Nov 16, 2013 at 2:03 PM, Richard Sandiford wrote: > Add tree_to_shwi and tree_to_uhwi. Initially tree_to_uhwi returns a > HOST_WIDE_INT, so that it's a direct replacement for tree_low_cst. > Patch 10 makes it return unsigned HOST_WIDE_INT instead. Possibly same naming issue (though we already do have the weird int_cst_value) - int_to_[us]hwi ()? OTOH if somebody else is fine with using tree_ for the four functions then call it a day. Richard. > Thanks, > Richard > > > gcc/ > * tree.h (tree_to_shwi, tree_to_uhwi): Declare, with inline > expansions. > * tree.c (tree_to_shwi, tree_to_uhwi): New functions. > > Index: gcc/tree.c > === > --- gcc/tree.c 2013-11-15 16:46:27.420395607 + > +++ gcc/tree.c 2013-11-15 16:47:15.226216885 + > @@ -7027,6 +7027,28 @@ tree_low_cst (const_tree t, int pos) >return TREE_INT_CST_LOW (t); > } > > +/* T is an INTEGER_CST whose numerical value (extended according to > + TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT. Return that > + HOST_WIDE_INT. */ > + > +HOST_WIDE_INT > +tree_to_shwi (const_tree t) > +{ > + gcc_assert (tree_fits_shwi_p (t)); > + return TREE_INT_CST_LOW (t); > +} > + > +/* T is an INTEGER_CST whose numerical value (extended according to > + TYPE_UNSIGNED) fits in an unsigned HOST_WIDE_INT. Return that > + HOST_WIDE_INT. */ > + > +HOST_WIDE_INT > +tree_to_uhwi (const_tree t) > +{ > + gcc_assert (tree_fits_uhwi_p (t)); > + return TREE_INT_CST_LOW (t); > +} > + > /* Return the most significant (sign) bit of T. 
*/ > > int > Index: gcc/tree.h > === > --- gcc/tree.h 2013-11-15 16:46:26.263399881 + > +++ gcc/tree.h 2013-11-15 16:46:56.569287095 + > @@ -3662,6 +3662,8 @@ extern bool tree_fits_uhwi_p (const_tree > #endif >; > extern HOST_WIDE_INT tree_low_cst (const_tree, int); > +extern HOST_WIDE_INT tree_to_shwi (const_tree); > +extern HOST_WIDE_INT tree_to_uhwi (const_tree); > #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003) > extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > tree_low_cst (const_tree t, int pos) > @@ -3669,6 +3671,20 @@ tree_low_cst (const_tree t, int pos) >gcc_assert (host_integerp (t, pos)); >return TREE_INT_CST_LOW (t); > } > + > +extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > +tree_to_shwi (const_tree t) > +{ > + gcc_assert (tree_fits_shwi_p (t)); > + return TREE_INT_CST_LOW (t); > +} > + > +extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > +tree_to_uhwi (const_tree t) > +{ > + gcc_assert (tree_fits_uhwi_p (t)); > + return TREE_INT_CST_LOW (t); > +} > #endif > extern int tree_int_cst_sgn (const_tree); > extern int tree_int_cst_sign_bit (const_tree);
Re: [6/10] Mechanical replacement of tree_low_cst (..., 0)
On Sat, Nov 16, 2013 at 2:05 PM, Richard Sandiford wrote: > Like patch 2, but using sed to replace "tree_low_cst (x, 0)" with > "tree_to_shwi (x)". Ok with whatever the naming bikeshedding arrives at. Richard. > Thanks, > Richard > > > gcc/c-family/ > * c-common.c, c-format.c, c-omp.c, c-pretty-print.c: Replace > tree_low_cst (..., 0) with tree_to_shwi throughout. > > gcc/c/ > * c-parser.c: Replace tree_low_cst (..., 0) with tree_to_shwi > throughout. > > gcc/cp/ > * class.c, dump.c, error.c, init.c, method.c, parser.c, semantics.c: > Replace tree_low_cst (..., 0) with tree_to_shwi throughout. > > gcc/go/ > * gofrontend/expressions.cc: Replace tree_low_cst (..., 0) with > tree_to_shwi throughout. > > gcc/java/ > * class.c, expr.c: Replace tree_low_cst (..., 0) with tree_to_shwi > throughout. > > gcc/objc/ > * objc-next-runtime-abi-02.c: Replace tree_low_cst (..., 0) with > tree_to_shwi throughout. > > gcc/ > * builtins.c, cilk-common.c, config/aarch64/aarch64.c, > config/alpha/alpha.c, config/arm/arm.c, config/c6x/predicates.md, > config/i386/i386.c, config/ia64/predicates.md, config/s390/s390.c, > coverage.c, dbxout.c, dwarf2out.c, except.c, explow.c, expr.c, expr.h, > fold-const.c, gimple-fold.c, godump.c, ipa-prop.c, omp-low.c, > predict.c, rtlanal.c, sdbout.c, stmt.c, stor-layout.c, targhooks.c, > tree-cfg.c, tree-data-ref.c, tree-inline.c, tree-ssa-forwprop.c, > tree-ssa-loop-prefetch.c, tree-ssa-phiopt.c, tree-ssa-sccvn.c, > tree-ssa-strlen.c, tree-stdarg.c, tree-vect-data-refs.c, > tree-vect-patterns.c, tree.c, tree.h, var-tracking.c, varasm.c: > Replace tree_low_cst (..., 0) with tree_to_shwi throughout. >
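The message doesn't show the actual sed command; a hypothetical one-liner in the same spirit (the pattern here is illustrative, not the one used for the patch):

```shell
# Hypothetical sketch of the mechanical rename described above; the real
# invocation used for the patch is not part of the message.
echo 'x = tree_low_cst (expr, 0);' \
  | sed 's/tree_low_cst (\(.*\), 0)/tree_to_shwi (\1)/'
# prints: x = tree_to_shwi (expr);
```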
Re: [7/10] Mechanical replacement of tree_low_cst (..., 1)
On Sat, Nov 16, 2013 at 2:06 PM, Richard Sandiford wrote: > Like the previous patch, but for tree_low_cst (x, 1) -> tree_to_uhwi (x). Same ok. Thanks, Richard. > Thanks, > Richard > > > gcc/ada/ > * gcc-interface/decl.c, gcc-interface/utils.c, gcc-interface/utils2.c: > Replace tree_low_cst (..., 1) with tree_to_uhwi throughout. > > gcc/c-family/ > * c-common.c, c-cppbuiltin.c: Replace tree_low_cst (..., 1) with > tree_to_uhwi throughout. > > gcc/c/ > * c-decl.c, c-typeck.c: Replace tree_low_cst (..., 1) with > tree_to_uhwi throughout. > > gcc/cp/ > * call.c, class.c, decl.c, error.c: Replace tree_low_cst (..., 1) with > tree_to_uhwi throughout. > > gcc/objc/ > * objc-encoding.c: Replace tree_low_cst (..., 1) with tree_to_uhwi > throughout. > > gcc/ > * alias.c, asan.c, builtins.c, cfgexpand.c, cgraph.c, > config/aarch64/aarch64.c, config/alpha/predicates.md, > config/arm/arm.c, config/darwin.c, config/epiphany/epiphany.c, > config/i386/i386.c, config/iq2000/iq2000.c, config/m32c/m32c-pragma.c, > config/mep/mep-pragma.c, config/mips/mips.c, > config/picochip/picochip.c, config/rs6000/rs6000.c, cppbuiltin.c, > dbxout.c, dwarf2out.c, emit-rtl.c, except.c, expr.c, fold-const.c, > function.c, gimple-fold.c, godump.c, ipa-cp.c, ipa-prop.c, omp-low.c, > predict.c, sdbout.c, stor-layout.c, trans-mem.c, tree-object-size.c, > tree-sra.c, tree-ssa-ccp.c, tree-ssa-forwprop.c, > tree-ssa-loop-ivcanon.c, tree-ssa-loop-ivopts.c, > tree-ssa-loop-niter.c, > tree-ssa-loop-prefetch.c, tree-ssa-strlen.c, tree-stdarg.c, > tree-switch-conversion.c, tree-vect-generic.c, tree-vect-loop.c, > tree-vect-patterns.c, tree-vrp.c, tree.c, tsan.c, ubsan.c, varasm.c: > Replace tree_low_cst (..., 1) with tree_to_uhwi throughout. >
Re: [9/10] Remove host_integerp and tree_low_cst
On Sat, Nov 16, 2013 at 2:12 PM, Richard Sandiford wrote: > Remove the old functions, which are now unused. Ok. Thanks, Richard. > Thanks, > Richard > > > gcc/ > * tree.h (host_integerp, tree_low_cst): Delete. > * tree.c (host_integerp, tree_low_cst): Delete. > > Index: gcc/tree.h > === > --- gcc/tree.h 2013-11-16 09:35:59.381239766 + > +++ gcc/tree.h 2013-11-16 10:14:00.618868694 + > @@ -3654,11 +3654,6 @@ extern int attribute_list_contained (con > extern int tree_int_cst_equal (const_tree, const_tree); > extern int tree_int_cst_lt (const_tree, const_tree); > extern int tree_int_cst_compare (const_tree, const_tree); > -extern int host_integerp (const_tree, int) > -#ifndef ENABLE_TREE_CHECKING > - ATTRIBUTE_PURE /* host_integerp is pure only when checking is disabled. */ > -#endif > - ; > extern bool tree_fits_shwi_p (const_tree) > #ifndef ENABLE_TREE_CHECKING >ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled. > */ > @@ -3669,18 +3664,10 @@ extern bool tree_fits_uhwi_p (const_tree >ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled. > */ > #endif >; > -extern HOST_WIDE_INT tree_low_cst (const_tree, int); > extern HOST_WIDE_INT tree_to_shwi (const_tree); > extern HOST_WIDE_INT tree_to_uhwi (const_tree); > #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003) > extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > -tree_low_cst (const_tree t, int pos) > -{ > - gcc_assert (host_integerp (t, pos)); > - return TREE_INT_CST_LOW (t); > -} > - > -extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > tree_to_shwi (const_tree t) > { >gcc_assert (tree_fits_shwi_p (t)); > Index: gcc/tree.c > === > --- gcc/tree.c 2013-11-16 09:59:37.205620348 + > +++ gcc/tree.c 2013-11-16 10:14:00.604868554 + > @@ -6970,26 +6970,6 @@ tree_int_cst_compare (const_tree t1, con > return 0; > } > > -/* Return 1 if T is an INTEGER_CST that can be manipulated efficiently on > - the host. 
If POS is zero, the value can be represented in a single > - HOST_WIDE_INT. If POS is nonzero, the value must be non-negative and can > - be represented in a single unsigned HOST_WIDE_INT. */ > - > -int > -host_integerp (const_tree t, int pos) > -{ > - if (t == NULL_TREE) > -return 0; > - > - return (TREE_CODE (t) == INTEGER_CST > - && ((TREE_INT_CST_HIGH (t) == 0 > - && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0) > - || (! pos && TREE_INT_CST_HIGH (t) == -1 > - && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0 > - && !TYPE_UNSIGNED (TREE_TYPE (t))) > - || (pos && TREE_INT_CST_HIGH (t) == 0))); > -} > - > /* Return true if T is an INTEGER_CST whose numerical value (extended > according to TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT. */ > > @@ -7016,17 +6996,6 @@ tree_fits_uhwi_p (const_tree t) > && TREE_INT_CST_HIGH (t) == 0); > } > > -/* Return the HOST_WIDE_INT least significant bits of T if it is an > - INTEGER_CST and there is no overflow. POS is nonzero if the result must > - be non-negative. We must be able to satisfy the above conditions. */ > - > -HOST_WIDE_INT > -tree_low_cst (const_tree t, int pos) > -{ > - gcc_assert (host_integerp (t, pos)); > - return TREE_INT_CST_LOW (t); > -} > - > /* T is an INTEGER_CST whose numerical value (extended according to > TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT. Return that > HOST_WIDE_INT. */
[PATCH, MPX, 2/X] Pointers Checker [18/25] CCP (Stack store/restore)
Hi, Here is a patch to support BUILT_IN_CHKP_BNDRET and BUILT_IN_CHKP_BIND_BOUNDS in BUILT_IN_STACK_SAVE result uses. Thanks, Ilya -- 2013-11-15 Ilya Enkovich * tree-ssa-ccp.c (insert_clobber_before_stack_restore): Handle BUILT_IN_CHKP_BNDRET and BUILT_IN_CHKP_BIND_BOUNDS calls. diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c index 50006ab..cfba927 100644 --- a/gcc/tree-ssa-ccp.c +++ b/gcc/tree-ssa-ccp.c @@ -1774,7 +1774,7 @@ insert_clobber_before_stack_restore (tree saved_val, tree var, gimple_htab *visited) { gimple stmt, clobber_stmt; - tree clobber; + tree clobber, fndecl; imm_use_iterator iter; gimple_stmt_iterator i; gimple *slot; @@ -1806,6 +1806,13 @@ insert_clobber_before_stack_restore (tree saved_val, tree var, else if (gimple_assign_ssa_name_copy_p (stmt)) insert_clobber_before_stack_restore (gimple_assign_lhs (stmt), var, visited); +else if (gimple_call_builtin_p (stmt, BUILT_IN_CHKP_BIND_BOUNDS)) + insert_clobber_before_stack_restore (gimple_call_lhs (stmt), var, + visited); +else if (gimple_code (stmt) == GIMPLE_CALL +&& (fndecl = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET)) +&& gimple_call_fndecl (stmt) == fndecl) + continue; else gcc_assert (is_gimple_debug (stmt)); }
Re: [8/10] Mop up remaining tree_low_cst calls
On Sat, Nov 16, 2013 at 2:10 PM, Richard Sandiford wrote: > Handle tree_low_cst references that weren't caught by the sed. Same ok. Thanks, Richard. > Thanks, > Richard > > > gcc/ada/ > * gcc-interface/cuintp.c (UI_From_gnu): Use tree_to_shwi rather than > tree_low_cst. > > gcc/c-family/ > * c-common.c (fold_offsetof_1): Use tree_to_uhwi rather than > tree_low_cst. > (complete_array_type): Update comment to refer to tree_to_[su]hwi > rather than tree_low_cst. > > gcc/c/ > * c-decl.c (grokdeclarator): Update comment to refer to > tree_to_[su]hwi rather than tree_low_cst. > > gcc/cp/ > * decl.c (reshape_init_array_1): Use tree_to_uhwi rather than > tree_low_cst. > (grokdeclarator): Update comment to refer to tree_to_[su]hwi rather > than tree_low_cst. > > gcc/ > * expr.h: Update comments to refer to tree_to_[su]hwi rather > than tree_low_cst. > * fold-const.c (fold_binary_loc): Likewise. > * expr.c (store_constructor): Use tree_to_uhwi rather than > tree_low_cst. > * ipa-utils.h (possible_polymorphic_call_target_p): Likewise. > * stmt.c (emit_case_dispatch_table): Likewise. > * tree-switch-conversion.c (emit_case_bit_tests): Likewise. 
> > Index: gcc/ada/gcc-interface/cuintp.c > === > --- gcc/ada/gcc-interface/cuintp.c 2013-11-16 13:08:22.531824320 + > +++ gcc/ada/gcc-interface/cuintp.c 2013-11-16 13:08:24.254837390 + > @@ -176,9 +176,9 @@ UI_From_gnu (tree Input) > >for (i = Max_For_Dint - 1; i >= 0; i--) > { > - v[i] = tree_low_cst (fold_build1 (ABS_EXPR, gnu_type, > + v[i] = tree_to_shwi (fold_build1 (ABS_EXPR, gnu_type, > fold_build2 (TRUNC_MOD_EXPR, gnu_type, > -gnu_temp, gnu_base)), 0); > +gnu_temp, gnu_base))); >gnu_temp = fold_build2 (TRUNC_DIV_EXPR, gnu_type, gnu_temp, gnu_base); > } > > Index: gcc/c-family/c-common.c > === > --- gcc/c-family/c-common.c 2013-11-16 13:08:22.531824320 + > +++ gcc/c-family/c-common.c 2013-11-16 13:08:46.45771 + > @@ -9721,8 +9721,7 @@ fold_offsetof_1 (tree expr) > return error_mark_node; > } >off = size_binop_loc (input_location, PLUS_EXPR, DECL_FIELD_OFFSET (t), > - size_int (tree_low_cst (DECL_FIELD_BIT_OFFSET (t), > - 1) > + size_int (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (t)) > / BITS_PER_UNIT)); >break; > > @@ -10091,7 +10090,7 @@ complete_array_type (tree *ptype, tree i > { >error ("size of array is too large"); >/* If we proceed with the array type as it is, we'll eventually > -crash in tree_low_cst(). */ > +crash in tree_to_[su]hwi(). */ >type = error_mark_node; > } > > Index: gcc/c/c-decl.c > === > --- gcc/c/c-decl.c 2013-11-16 13:08:22.531824320 + > +++ gcc/c/c-decl.c 2013-11-16 13:08:24.258837421 + > @@ -5912,7 +5912,7 @@ grokdeclarator (const struct c_declarato >else > error_at (loc, "size of unnamed array is too large"); >/* If we proceed with the array type as it is, we'll eventually > -crash in tree_low_cst(). */ > +crash in tree_to_[su]hwi(). 
*/ >type = error_mark_node; > } > > Index: gcc/cp/decl.c > === > --- gcc/cp/decl.c 2013-11-16 13:08:22.531824320 + > +++ gcc/cp/decl.c 2013-11-16 13:09:31.845353189 + > @@ -5095,8 +5095,7 @@ reshape_init_array_1 (tree elt_type, tre > max_index_cst = tree_to_uhwi (max_index); >/* sizetype is sign extended, not zero extended. */ >else > - max_index_cst = tree_low_cst (fold_convert (size_type_node, > max_index), > - 1); > + max_index_cst = tree_to_uhwi (fold_convert (size_type_node, > max_index)); > } > >/* Loop until there are no more initializers. */ > @@ -10031,7 +10030,7 @@ grokdeclarator (const cp_declarator *dec > { >error ("size of array %qs is too large", name); >/* If we proceed with the array type as it is, we'll eventually > -crash in tree_low_cst(). */ > +crash in tree_to_[su]hwi(). */ >type = error_mark_node; > } > > Index: gcc/expr.h > === > --- gcc/expr.h 2013-11-16 13:08:22.531824320 + > +++ gcc/expr.h 2013-11-16 13:08:24.263837459 + > @@ -26,8 +26,8 @@ #define GC
Re: [10/10] Make tree_to_uhwi return unsigned
On Sat, Nov 16, 2013 at 2:25 PM, Richard Sandiford wrote: > This is probably the only non-obvious part of the series. I went through > all callers to tree_to_uhwi to see whether they were used in a context > where signedness mattered. If so, I tried to adjust the casting to match. > > This mostly meant removing casts to unsigned types. There are a couple > of cases where I added casts to HOST_WIDE_INT though, to mimic the old > tree_low_cst behaviour: > > - In cfgexpand.c and trans-mem.c, where we're comparing the value > with an int PARAM_VALUE. The test isn't watertight since any > unsigned constant > HOST_WIDE_INT_MAX is going to be accepted. > That's a preexisting problem though and it can be fixed more > easily with wi:: routines. Until then this preserves the current > behaviour. > > - In the AArch32/64 and powerpc ABI handling. Here too "count" > is an int and is probably not safe for large values anyway; e.g.: > > count *= (1 + tree_to_uhwi (TYPE_MAX_VALUE (index)) > - tree_to_uhwi (TYPE_MIN_VALUE (index))); > > is done without overflow checking. This too is easier to fix > with wi::, so I've just kept it as a signed comparison for now. Ok. Thanks, Richard. > Thanks, > Richard > > > gcc/c-family/ > * c-common.c (convert_vector_to_pointer_for_subscript): Remove > cast to unsigned type. > > gcc/ > * tree.h (tree_to_uhwi): Return an unsigned HOST_WIDE_INT. > * tree.c (tree_to_uhwi): Return an unsigned HOST_WIDE_INT. > (tree_ctz): Remove cast to unsigned type. > * builtins.c (fold_builtin_memory_op): Likewise. > * dwarf2out.c (descr_info_loc): Likewise. > * godump.c (go_output_typedef): Likewise. > * omp-low.c (expand_omp_simd): Likewise. > * stor-layout.c (excess_unit_span): Likewise. > * tree-object-size.c (addr_object_size): Likewise. > * tree-sra.c (analyze_all_variable_accesses): Likewise. > * tree-ssa-forwprop.c (simplify_builtin_call): Likewise. > (simplify_rotate): Likewise. 
> * tree-ssa-strlen.c (adjust_last_stmt, handle_builtin_memcpy) > (handle_pointer_plus): Likewise. > * tree-switch-conversion.c (check_range): Likewise. > * tree-vect-patterns.c (vect_recog_rotate_pattern): Likewise. > * tsan.c (instrument_builtin_call): Likewise. > * cfgexpand.c (defer_stack_allocation): Add cast to HOST_WIDE_INT. > * trans-mem.c (tm_log_add): Likewise. > * config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Likewise. > * config/arm/arm.c (aapcs_vfp_sub_candidate): Likewise. > * config/rs6000/rs6000.c (rs6000_aggregate_candidate): Likewise. > * config/mips/mips.c (r10k_safe_mem_expr_p): Make offset unsigned. > > Index: gcc/c-family/c-common.c > === > --- gcc/c-family/c-common.c 2013-11-16 10:13:53.825800713 + > +++ gcc/c-family/c-common.c 2013-11-16 10:14:40.373263297 + > @@ -11702,8 +11702,7 @@ convert_vector_to_pointer_for_subscript > >if (TREE_CODE (index) == INTEGER_CST) > if (!tree_fits_uhwi_p (index) > -|| ((unsigned HOST_WIDE_INT) tree_to_uhwi (index) > - >= TYPE_VECTOR_SUBPARTS (type))) > +|| tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type)) >warning_at (loc, OPT_Warray_bounds, "index value is out of bound"); > >c_common_mark_addressable_vec (*vecp); > Index: gcc/tree.h > === > --- gcc/tree.h 2013-11-16 10:14:00.618868694 + > +++ gcc/tree.h 2013-11-16 10:14:40.488264431 + > @@ -3665,7 +3665,7 @@ extern bool tree_fits_uhwi_p (const_tree > #endif >; > extern HOST_WIDE_INT tree_to_shwi (const_tree); > -extern HOST_WIDE_INT tree_to_uhwi (const_tree); > +extern unsigned HOST_WIDE_INT tree_to_uhwi (const_tree); > #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003) > extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > tree_to_shwi (const_tree t) > @@ -3674,7 +3674,7 @@ tree_to_shwi (const_tree t) >return TREE_INT_CST_LOW (t); > } > > -extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT > +extern inline __attribute__ ((__gnu_inline__)) unsigned HOST_WIDE_INT > tree_to_uhwi (const_tree t) > { >gcc_assert 
(tree_fits_uhwi_p (t)); > Index: gcc/tree.c > === > --- gcc/tree.c 2013-11-16 10:14:00.604868554 + > +++ gcc/tree.c 2013-11-16 10:14:40.488264431 + > @@ -2211,8 +2211,7 @@ tree_ctz (const_tree expr) > case LSHIFT_EXPR: >ret1 = tree_ctz (TREE_OPERAND (expr, 0)); >if (tree_fits_uhwi_p (TREE_OPERAND (expr, 1)) > - && ((unsigned HOST_WIDE_INT) tree_to_uhwi (TREE_OPERAND (expr, 1)) > - < (unsigned HOST_WIDE_INT) prec)) > + && (tree_to_uhwi (TREE_OPERAND (expr, 1)) <
[PATCH, MPX, 2/X] Pointers Checker [19/25] Strchr
Hi, Here is a patch to replace BUILT_IN_CHKP_BNDRET with assignment when optimizing strchr. Thanks, Ilya -- 2013-11-13 Ilya Enkovich * tree-ssa-strlen.c: Include tree-chkp.h. (handle_builtin_strchr): Remove retbnd call when strchr call is removed. diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index cfd7b00..711f5d7 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-pretty-print.h" #include "params.h" #include "expr.h" +#include "tree-chkp.h" /* A vector indexed by SSA_NAME_VERSION. 0 means unknown, positive value is an index into strinfo vector, negative value stands for @@ -961,6 +962,8 @@ handle_builtin_strchr (gimple_stmt_iterator *gsi) tree src; gimple stmt = gsi_stmt (*gsi); tree lhs = gimple_call_lhs (stmt); + gimple retbnd_stmt = NULL; + tree retbnd = NULL; if (lhs == NULL_TREE) return; @@ -1009,6 +1012,14 @@ handle_builtin_strchr (gimple_stmt_iterator *gsi) TREE_TYPE (rhs))) rhs = fold_convert_loc (loc, TREE_TYPE (lhs), rhs); } + + /* Remember passed and returned bounds if any. */ + if (gimple_call_with_bounds_p (stmt)) + { + retbnd = chkp_get_call_arg_bounds (gimple_call_arg (stmt, 0)); + retbnd_stmt = chkp_retbnd_call_by_val (lhs); + } + if (!update_call_from_tree (gsi, rhs)) gimplify_and_update_call_from_tree (gsi, rhs); stmt = gsi_stmt (*gsi); @@ -1018,6 +1029,18 @@ handle_builtin_strchr (gimple_stmt_iterator *gsi) fprintf (dump_file, "into: "); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); } + + /* Replace retbnd call with assignment. */ + if (retbnd_stmt) + { + gimple_stmt_iterator ret_gsi = gsi_for_stmt (retbnd_stmt); + + if (!update_call_from_tree (&ret_gsi, retbnd)) + gimplify_and_update_call_from_tree (&ret_gsi, retbnd); + retbnd_stmt = gsi_stmt (ret_gsi); + update_stmt (retbnd_stmt); +} + if (si != NULL && si->endptr == NULL_TREE && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
Re: [1/10] Add tree_fits_shwi_p and tree_fits_uhwi_p
On Mon, Nov 18, 2013 at 11:50:45AM +0100, Richard Biener wrote: > On Sat, Nov 16, 2013 at 1:55 PM, Richard Sandiford > wrote: > > Add tree_fits_shwi_p and tree_fits_uhwi_p. The implementations are taken > > directly from host_integerp. > > tree_ is a bit generic - you only ever return true for INTEGER_CSTs, > so please use int_fits_[su]hwi_p please (mimicing int_fits_type_p). > > Ok with that change. I actually think tree_fits is better, int_fits* looks too ambiguous to me, what kind of integer (tree, int, HWI, ...) you actually mean. It also matches better say the preexisting tree_to_double_int. Jakub
Re: [5/10] Add tree_to_shwi and tree_to_uhwi
Richard Biener writes: > On Sat, Nov 16, 2013 at 2:03 PM, Richard Sandiford > wrote: >> Add tree_to_shwi and tree_to_uhwi. Initially tree_to_uhwi returns a >> HOST_WIDE_INT, so that it's a direct replacement for tree_low_cst. >> Patch 10 makes it return unsigned HOST_WIDE_INT instead. > > Possibly same naming issue (though we already do have the weird > int_cst_value) - int_to_[us]hwi ()? > > OTOH if somebody else is fine with using tree_ for the four functions > then call it a day. :-) I'm OK either way. It'd certainly be no problem to change the names. Thanks, Richard
Re: [1/10] Add tree_fits_shwi_p and tree_fits_uhwi_p
> I actually think tree_fits is better, int_fits* looks too ambiguous to me, > what kind of integer (tree, int, HWI, ...) you actually mean. Seconded, all the more so that int_fits_type_p requires an INTEGER_CST as argument, unlike tree_fits_[su]hwi_p which accept any tree as argument. -- Eric Botcazou
Re: Re: [Patch] Fix gcc.dg/20050922-*.c
Mike, On 25/10/13 00:37, Mike Stump wrote: On Oct 24, 2013, at 2:26 AM, Vidya Praveen wrote: On Mon, Oct 21, 2013 at 06:40:28PM +0100, Mike Stump wrote: On Oct 21, 2013, at 3:28 AM, Vidya Praveen wrote: Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c include stdlib.h. This can be an issue, especially since they define uint32_t. OK for 4.7, 4.8? It fails on arm-none-eabi. Ok, let it bake on trunk and then you can back port it if no one screams. I think it has baked long enough. Could this be approved for 4.7 and 4.8 now? VP.
Re: [0/10] Replace host_integerp and tree_low_cst
On Mon, Nov 18, 2013 at 10:08 AM, Richard Sandiford wrote: > Jeff Law writes: >> On 11/16/13 05:53, Richard Sandiford wrote: >>> After the patch that went in yesterday, all calls to host_integerp and >>> tree_low_cst pass a constant "pos" argument. This series replaces each >>> function with two separate ones: >> [ ... ] >> So I've almost entirely ignored the whole wide-int conversion discussion >> and I suspect I'm not entirely alone. >> >> Can you briefly summarize what's y'all are trying to accomplish with the >> wide-int changes? > > At the moment, we can only handle tree and rtl integer constants that > fit in 2 HOST_WIDE_INTs. The idea is to remove that limit. E.g. things > like OImode (used in a few ports) will become a first-class citizen, > with all OImode values being representable. > > Besides that headline reason, there are various side benefits. E.g.: > > - All INTEGER_CSTs can be viewed either in their TYPE_PRECISION or in > "infinite" precision, which isn't possible for 128-bit constants today. > (I.e. there's no way to distinguish signed and unsigned 128-bit constants > in a double_int.) > > - Wider-than-2-HWI intermediate results can be represented as a single > integer. I'm told this is useful for VRP. (wide-int is mostly Kenny > and Mike's work, I've just been butting in recently.) > > - rtl-level constant folding can use the same code to handle all > combinations of CONST_INT and CONST_DOUBLE (and CONST_WIDE_INT, > on converted ports). At the moment we handle CONST_INT cases > specially, and don't try as hard with CONST_DOUBLEs. > > Implementation-wise, it tries to make it so that the common single-HWI > cases are still fast. Sadly CONST_WIDE_INTs don't get a mode: rtx immed_wide_int_const (const wide_int &v, enum machine_mode mode) { ... /* It is so tempting to just put the mode in here. Must control myself ... */ PUT_MODE (value, VOIDmode); CWI_PUT_NUM_ELEM (value, len); so much for an incentive to get more targets converted... 
(only after all targets are converted we possibly can merge CONST_INT and CONST_WIDE_INT). Richard. > Thanks, > Richard
Re: Clean up LTO generation
On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt wrote: > I'm looking into using LTO to pass information between compilers for > different targets, for the OpenACC implementation. This area of the > compiler seems somewhat obfuscated by overengineering, and I'd like to > simplify it first to make it easier to work with. > > lto_gimple_out and lto_finish_out aren't real passes, only their > write_summary methods are nonnull. The only thing we really do to emit > LTO is just to make two function calls, to lto_output and > produce_asm_for_decls, but this is wrapped in a lot of pass_manager > boilerplate to confuse the reader. Other downsides are bogus empty dump > files left behind, and plain dead code like > > ipa_read_optimization_summaries_1 (passes->all_lto_gen_passes); > > The following patch simplifies this by getting rid of > all_lto_gen_passes. We could simplify a little further if we decided we > really don't need two different timevars for different steps of LTO > output. Bootstrapped and tested on x86_64-linux, ok? I'm fine with this - Honza, any objections? Thanks, Richard. > > Bernd
Re: [RFA][PATCH]Fix 59019
On Mon, Nov 18, 2013 at 5:36 AM, Jeff Law wrote: > On 11/17/13 04:28, Steven Bosscher wrote: >> >> >> TRAP_CONDITION (PATTERN (i3)) == const1_rtx >> >> But shouldn't the check be on const_true_rtx? Or does combine put a >> const1_rtx there? > > I took const1_rtx from control_flow_insn_p. That's ultimately what we need > to be consistent with. Hmm, I agree but look at the test in ifcvt.c... >> Bah... Emitting the barrier is necessary here because >> fixup_reorder_chain doesn't handle cases where a basic block is a dead >> end. That is actually a bug in fixup_reorder_chain: Other passes could >> create dead ends in the CFG in cfglayout mode and not emit a barrier >> into BB_FOOTER, and fixup_reorder_chain wouldn't be able to handle >> that (resulting in verify_flow_info failure). > > Umm, no. Failure to emit the barrier will result in a checking failure. > Been there, done that. Read my comment again: "... is necessary ..." and "... fixup_reorder_chain should emit a BARRIER ...". So yes, you need to emit that barrier with GCC as-is. But in a perfect, ideal world where everyone is happy all the time, the sun always shines, lunch is always free, and everything smells like red roses at dawn, you shouldn't have to emit a BARRIER when the compiler is in cfglayout mode. Instead, fixup_reorder_chain *should* do it for you, but obviously doesn't. There are several places where GCC code emits barriers while in cfglayout mode, and that makes no sense because barriers are meaningless in cfglayout mode :-) > To trigger use the test in 59019 with an itanic cross compiler and comment > out these two lines from gimple-ssa-isolate-paths.c: > > > TREE_THIS_VOLATILE (op) = 1; > TREE_SIDE_EFFECTS (op) = 1; OK, thanks. Let's see if I can tackle this one. Ciao! Steven
[PATCH, MPX, 2/X] Pointers Checker [20/25] Debug info
Hi, Here is a patch to avoid ICE in debug info generator for instrumented code. Thanks, Ilya -- 2013-11-15 Ilya Enkovich * dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE. * dwarf2out.c (gen_subprogram_die): Ignore bound args. (gen_type_die_with_usage): Skip pointer bounds. (dwarf2out_global_decl): Likewise. diff --git a/gcc/dbxout.c b/gcc/dbxout.c index 84f765e..2e97b9f 100644 --- a/gcc/dbxout.c +++ b/gcc/dbxout.c @@ -2365,6 +2365,10 @@ dbxout_type (tree type, int full) dbxout_type (TREE_TYPE (type), 0); break; +case POINTER_BOUNDS_TYPE: + /* No debug info for pointer bounds type supported yet. */ + break; + default: gcc_unreachable (); } diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 3822a65..f7e97bf 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -18284,7 +18284,7 @@ gen_subprogram_die (tree decl, dw_die_ref context_die) gen_formal_parameter_pack_die (generic_decl_parm, parm, subr_die, &parm); - else if (parm) + else if (parm && !POINTER_BOUNDS_P (parm)) { dw_die_ref parm_die = gen_decl_die (parm, NULL, subr_die); @@ -18296,6 +18296,8 @@ gen_subprogram_die (tree decl, dw_die_ref context_die) parm = DECL_CHAIN (parm); } + else if (parm) + parm = DECL_CHAIN (parm); if (generic_decl_parm) generic_decl_parm = DECL_CHAIN (generic_decl_parm); @@ -19794,6 +19796,7 @@ gen_type_die_with_usage (tree type, dw_die_ref context_die, case FIXED_POINT_TYPE: case COMPLEX_TYPE: case BOOLEAN_TYPE: +case POINTER_BOUNDS_TYPE: /* No DIEs needed for fundamental types. */ break; @@ -20441,7 +20444,8 @@ dwarf2out_global_decl (tree decl) declarations, file-scope (extern) function declarations (which had no corresponding body) and file-scope tagged type declarations and definitions which have not yet been forced out. */ - if (TREE_CODE (decl) != FUNCTION_DECL || !DECL_INITIAL (decl)) + if ((TREE_CODE (decl) != FUNCTION_DECL || !DECL_INITIAL (decl)) + && !POINTER_BOUNDS_P (decl)) dwarf2out_decl (decl); }
Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx patterns in IVOPT
On 11/04/2013 04:31 AM, bin.cheng wrote: > 2013-11-01 Bin Cheng > > * tree-ssa-loop-ivopts.c (enum ainc_type): New. > (address_cost_data): New field. > (get_address_cost): Compute auto-increment rtx cost in ainc_costs. > Use ainc_costs for auto-increment rtx patterns. > Cleanup TWS. I think this is fine. I'd just like to see AINC_NUM gone and its use replaced by AINC_NONE; we don't really need two separate enum codes for that. Bernd
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi, This modified test case exposes a bug in the already approved part of the strict-volatile-bitfields patch: #include <stdlib.h> typedef struct { char pad; int arr[0]; } __attribute__((packed)) str; str * foo (int* src) { str *s = malloc (sizeof (str) + sizeof (int)); s->arr[0] = 0x12345678; asm volatile("":::"memory"); *src = s->arr[0]; return s; } As we know, this test case triggered a recursion in store_bit_field on ARM and on PowerPC, which is no longer reproducible after this patch is applied: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html Additionally it triggered a recursion in extract_bit_field, but _only_ on my local copy of the trunk. I had this patch installed, but did not expect it to change anything unless the values are volatile. That was caused by this hunk in the strict-volatile-bitfields v4 patch: @@ -1691,45 +1736,19 @@ extract_fixed_bit_field (enum machine_mo includes the entire field. If such a mode would be larger than a word, we won't be doing the extraction the normal way. */ - if (MEM_VOLATILE_P (op0) - && flag_strict_volatile_bitfields > 0) - { - if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0) - mode = GET_MODE (op0); - else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0) - mode = GET_MODE (target); - else - mode = tmode; - } - else - mode = get_best_mode (bitsize, bitnum, 0, 0, - MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0)); + mode = GET_MODE (op0); + if (GET_MODE_BITSIZE (mode) == 0 + || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode)) + mode = word_mode; + mode = get_best_mode (bitsize, bitnum, 0, 0, + MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0)); if (mode == VOIDmode) /* The only way this should occur is if the field spans word boundaries. */ return extract_split_bit_field (op0, bitsize, bitnum, unsignedp); So the problem started because initially this function did not look at GET_MODE(op0) and always used word_mode. That was changed, but now also affected non-volatile data. 
Now, if we solve this differently and install the C++ memory model patch, we can avoid introducing the recursion in the extract path, and remove these two hunks in the update patch at the same time: + else if (MEM_P (str_rtx) + && MEM_VOLATILE_P (str_rtx) + && flag_strict_volatile_bitfields > 0) + /* This is a case where -fstrict-volatile-bitfields doesn't apply + because we can't do a single access in the declared mode of the field. + Since the incoming STR_RTX has already been adjusted to that mode, + fall back to word mode for subsequent logic. */ + str_rtx = adjust_address (str_rtx, word_mode, 0); Attached you'll find a new version of the bitfields-update patch; it is again relative to the already approved version of the volatile-bitfields patch v4, part 1/2. Bootstrapped and regression-tested on X86_64-linux-gnu. Additionally tested with an ARM cross-compiler. OK for trunk? Thanks Bernd. 2013-11-18 Bernd Edlinger Sandra Loosemore PR middle-end/23623 PR middle-end/48784 PR middle-end/56341 PR middle-end/56997 * expmed.c (strict_volatile_bitfield_p): Add bitregion_start and bitregion_end parameters. Test for compliance with C++ memory model. (store_bit_field): Adjust call to strict_volatile_bitfield_p. Add fallback logic for cases where -fstrict-volatile-bitfields is supposed to apply, but cannot. (extract_bit_field): Likewise. Use narrow_bit_field_mem and extract_fixed_bit_field_1 to do the extraction. (extract_fixed_bit_field): Revert to previous mode selection algorithm. Call extract_fixed_bit_field_1 to do the real work. (extract_fixed_bit_field_1): New function. testsuite: 2013-11-18 Bernd Edlinger Sandra Loosemore * gcc.dg/pr23623.c: Update to test interaction with C++ memory model. patch-bitfields-update-1.diff Description: Binary data
Re: [PATCH] Support -fsanitize=leak
On Fri, Nov 15, 2013 at 08:41:38PM +0100, Jakub Jelinek wrote: > Hi! > > This patch adds support for -fsanitize=leak and -static-liblsan options. > If combined with -fsanitize=address, it does nothing, otherwise it links > in liblsan, a new shared+static library (on x86_64-linux only so far, > the code isn't 32-bit ready apparently). > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2013-11-14 Jakub Jelinek > > PR sanitizer/59061 > * common.opt (static-liblsan): Add. > * config/gnu-user.h (STATIC_LIBLSAN_LIBS, STATIC_LIBUBSAN_LIBS): > Define. > * flag-types.h (enum sanitize_code): Add SANITIZE_LEAK. Renumber > SANITIZE_SHIFT, SANITIZE_DIVIDE, SANITIZE_UNREACHABLE, SANITIZE_VLA. > * opts.c (common_handle_option): Handle -fsanitize=leak. > * gcc.c (ADD_STATIC_LIBLSAN_LIBS, LIBLSAN_SPEC): Define. > (LIBUBSAN_SPEC): Don't test LIBUBSAN_EARLY_SPEC. > (LIBUBSAN_EARLY_SPEC): Remove. > (SANITIZER_EARLY_SPEC): Don't do anything for libubsan. > (SANITIZER_SPEC): Add -fsanitize=leak handling. > (sanitize_spec_function): Handle %sanitize(leak). > > * configure.tgt: Set LSAN_SUPPORTED=yes for x86_64-linux. > * configure.ac (LSAN_SUPPORTED): New AM_CONDITIONAL. > * lsan/Makefile.am (toolexeclib_LTLIBRARIES, lsan_files, > liblsan_la_SOURCES, liblsan_la_LIBADD, liblsan_la_LDFLAGS): Add. > * lsan/Makefile.in: Regenerated. Looks good to me. Moreover, it fixes the bootstrap-ubsan failure when building fixincl - I guess the gnu-user.h hunk was missing. Thanks. Marek
Re: [0/10] Replace host_integerp and tree_low_cst
Richard Biener writes: > On Mon, Nov 18, 2013 at 10:08 AM, Richard Sandiford > wrote: >> Jeff Law writes: >>> On 11/16/13 05:53, Richard Sandiford wrote: After the patch that went in yesterday, all calls to host_integerp and tree_low_cst pass a constant "pos" argument. This series replaces each function with two separate ones: >>> [ ... ] >>> So I've almost entirely ignored the whole wide-int conversion discussion >>> and I suspect I'm not entirely alone. >>> >>> Can you briefly summarize what's y'all are trying to accomplish with the >>> wide-int changes? >> >> At the moment, we can only handle tree and rtl integer constants that >> fit in 2 HOST_WIDE_INTs. The idea is to remove that limit. E.g. things >> like OImode (used in a few ports) will become a first-class citizen, >> with all OImode values being representable. >> >> Besides that headline reason, there are various side benefits. E.g.: >> >> - All INTEGER_CSTs can be viewed either in their TYPE_PRECISION or in >> "infinite" precision, which isn't possible for 128-bit constants today. >> (I.e. there's no way to distinguish signed and unsigned 128-bit constants >> in a double_int.) >> >> - Wider-than-2-HWI intermediate results can be represented as a single >> integer. I'm told this is useful for VRP. (wide-int is mostly Kenny >> and Mike's work, I've just been butting in recently.) >> >> - rtl-level constant folding can use the same code to handle all >> combinations of CONST_INT and CONST_DOUBLE (and CONST_WIDE_INT, >> on converted ports). At the moment we handle CONST_INT cases >> specially, and don't try as hard with CONST_DOUBLEs. >> >> Implementation-wise, it tries to make it so that the common single-HWI >> cases are still fast. > > Sadly CONST_WIDE_INTs don't get a mode: > > rtx > immed_wide_int_const (const wide_int &v, enum machine_mode mode) > { > ... > /* It is so tempting to just put the mode in here. Must control >myself ... 
*/ > PUT_MODE (value, VOIDmode); > CWI_PUT_NUM_ELEM (value, len); > > so much for an incentive to get more targets converted... > (only after all targets are converted we possibly can merge CONST_INT > and CONST_WIDE_INT). Yeah, but the other requirement for merging CONST_INT and CONST_WIDE_INT is that they both treat the mode field in the same way. Which means that all existing CONST_INT code needs to be converted to store the mode first. Changing whether CONST_WIDE_INT stores a mode is trivial compared to that. Adding modes to the rtx integer constants is something I definitely want to do (and have some local patches towards), but it's going to take a while. Until then I think it would just be too confusing to have a TImode 0 stored without a mode but a TImode 1 << 100 (say) stored with a mode. In the meantime, the incentive for converting targets is so that we can stop using CONST_DOUBLE for integers. Plus the interfaces are IMO nicer with wi::... Thanks, Richard
[PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation
Hi, Here is a patch to add a size relocation and an instruction to obtain an object's size on the i386 target. Thanks, Ilya -- 2013-11-15 Ilya Enkovich * config/i386/i386.md (UNSPEC_SIZEOF): New. (move_size_reloc_<mode>): New. * config/i386/predicates.md (size_relocation): New. (x86_64_zext_immediate_operand): Support UNSPEC_SIZEOF. * config/i386/i386.c (output_pic_addr_const): Support UNSPEC_SIZEOF. (i386_asm_output_addr_const_extra): Likewise. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index d581b96..a427c15 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -13809,6 +13809,10 @@ output_pic_addr_const (FILE *file, rtx x, int code) machopic_output_function_base_name (file); break; #endif + case UNSPEC_SIZEOF: + fputs ("@SIZE", file); + break; + default: output_operand_lossage ("invalid UNSPEC as operand"); break; @@ -15387,6 +15391,11 @@ i386_asm_output_addr_const_extra (FILE *file, rtx x) } break; +case UNSPEC_SIZEOF: + output_addr_const (file, op); + fputs ("@SIZE", file); + break; + default: return false; } diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index e23b3b6..97dd89c 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -79,6 +79,7 @@ UNSPEC_PLTOFF UNSPEC_MACHOPIC_OFFSET UNSPEC_PCREL + UNSPEC_SIZEOF ;; Prologue support UNSPEC_STACK_ALLOC @@ -18446,6 +18447,14 @@ "bndstx\t{%2, %3|%3, %2}" [(set_attr "type" "mpxst")]) +(define_insn "move_size_reloc_<mode>" + [(set (match_operand:SWI48 0 "register_operand" "=r") +(match_operand:<MODE> 1 "size_relocation" "Z"))] + "" +{ + return "mov{<imodesuffix>}\t{%1, %0|%0, %1}"; +}) + (include "mmx.md") (include "sse.md") (include "sync.md") diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md index 66ac52f..23231b1 100644 --- a/gcc/config/i386/predicates.md +++ b/gcc/config/i386/predicates.md @@ -119,6 +119,15 @@ (match_test "TARGET_64BIT") (match_test "REGNO (op) > BX_REG"))) +;; Return true if VALUE is size relocation +(define_predicate "size_relocation" + (match_code 
"const") +{ + return (GET_CODE (op) == CONST + && GET_CODE (XEXP (op, 0)) == UNSPEC + && XINT (XEXP (op, 0), 1) == UNSPEC_SIZEOF); +}) + ;; Return true if VALUE can be stored in a sign extended immediate field. (define_predicate "x86_64_immediate_operand" (match_code "const_int,symbol_ref,label_ref,const") @@ -323,6 +332,13 @@ return false; } } + else if (GET_CODE (XEXP (op, 0)) == UNSPEC) +{ + if (XINT (XEXP (op, 0), 1) == UNSPEC_SIZEOF + && XVECLEN (XEXP (op, 0), 0) == 1 + && GET_CODE (XVECEXP (XEXP (op, 0), 0, 0)) == SYMBOL_REF) + return true; +} break; default:
Re: Clean up LTO generation
> On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt > wrote: > > I'm looking into using LTO to pass information between compilers for > > different targets, for the OpenACC implementation. This area of the > > compiler seems somewhat obfuscated by overengineering, and I'd like to > > simplify it first to make it easier to work with. > > > > lto_gimple_out and lto_finish_out aren't real passes, only their > > write_summary methods are nonnull. The only thing we really do to emit > > LTO is just to make two function calls, to lto_output and > > produce_asm_for_decls, but this is wrapped in a lot of pass_manager > > boilerplate to confuse the reader. Other downsides are bogus empty dump > > files left behind, and plain dead code like > > > > ipa_read_optimization_summaries_1 (passes->all_lto_gen_passes); > > > > The following patch simplifies this by getting rid of > > all_lto_gen_passes. We could simplify a little further if we decided we > > really don't need two different timevars for different steps of LTO > > output. Bootstrapped and tested on x86_64-linux, ok? > > I'm fine with this - Honza, any objections? Not at all, I also considered this somewhat pointless. Honza > > Thanks, > Richard. > > > > > Bernd
[PATCH] Fix libbacktrace on prelinked shared libraries
Hi! info->dlpi_addr == 0 is true for executables (but those have also info->dlpi_name set to ""), but not just for those, it is true also for successfully prelinked shared libraries or any other libraries that have been linked at non-zero offset and the dynamic linker managed to mmap them where they were prelinked resp. linked to. BTW, I'm surprised you aren't adding base_address (well, in fact it is actually not base address, but load bias, difference between load address and link base address) to elf_initialize_syminfo created symbol addresses, because all of them should be biased by the load address. And, elf_initialize_syminfo should ignore st_shndx == SHN_UNDEF symbols, those shouldn't be biased but not even entered into the table. 2013-11-18 Jakub Jelinek * elf.c (phdr_callback): Process info->dlpi_addr == 0 normally. --- libbacktrace/elf.c.jj 2013-11-18 09:59:08.0 +0100 +++ libbacktrace/elf.c 2013-11-18 12:48:07.780700382 +0100 @@ -863,12 +863,8 @@ phdr_callback (struct dl_phdr_info *info fileline elf_fileline_fn; int found_dwarf; - /* There is not much we can do if we don't have the module name. If - the base address is 0, this is probably the executable, which we - already loaded. */ - if (info->dlpi_name == NULL - || info->dlpi_name[0] == '\0' - || info->dlpi_addr == 0) + /* There is not much we can do if we don't have the module name. */ + if (info->dlpi_name == NULL || info->dlpi_name[0] == '\0') return 0; descriptor = backtrace_open (info->dlpi_name, pd->error_callback, pd->data, Jakub
Re: Clean up LTO generation
On 11/18/2013 02:03 PM, Jan Hubicka wrote: >> On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt >> wrote: >>> The following patch simplifies this by getting rid of >>> all_lto_gen_passes. We could simplify a little further if we decided we >>> really don't need two different timevars for different steps of LTO >>> output. Bootstrapped and tested on x86_64-linux, ok? >> >> I'm fine with this - Honza, any objections? > > Not at all, I also considered this somewhat pointless. Ok, cool. Any opinions on the two timevars, should I reduce that before committing? Bernd
Re: [PATCH] Fix PRs59125 and 54570
On Fri, 15 Nov 2013, Richard Biener wrote: > On Fri, 15 Nov 2013, Jakub Jelinek wrote: > > > On Fri, Nov 15, 2013 at 02:56:51PM +0100, Richard Biener wrote: > > > Now that there is (finally :() a wrong-code testcase for the > > > PR54570 issue we can no longer ignore it (bah). So the following > > > tries to paper over the fact that object-size sucks and disables > > > value-numbering of equal addresses the same before that pass > > > had a chance to finally look at the structure of the addresses. > > > > > > To make this "fix" suck less I moved the object-size pass before > > > the final FRE pass runs which is after IPA inlining and the > > > propagation of constants and addresses. You won't catch > > > any "improvements" you'd get by memory CSE opportunities that > > > IPA inlining exposes, but you cannot have everything here. > > > > If it doesn't regress anything in the testsuite, I guess that is ok. > > > > > (IMHO object-size would should run during early optimizations) > > > > It can't, because inlining and some limited cleanup afterwards is essential > > for it. Otherwise you'd regress not just for __builtin_object_size (x, 1), > > which admittedly is problematic since the introduction of MEM_REFs and > > various other changes, but also for __builtin_object_size (x, 0), which > > would be much more important. > > > > As discussed earlier, perhaps instead of checking cfun->after_inlining > > you could just introduce a new flag whether cfun contains any > > __builtin_object_size (x, {1,3}) calls, initialized by the gimplifier, > > propagated by the inliner and finally cleared again by objsz pass. > > But you'd need to pessimistically initialize it because if you inline > into a function with __builtin_object_size you may not previously > optimize. You can of course analyze the cgraph to clear it for > functions that cannot end up being inlined into such function. 
> So that's effectively the same as ->after_inlining > minus losing the optimization that didn't end up with > __builtin_object_size after that. > > Not sure if it's worth all the trouble. Arriving at a better > design for computing __builtin_object_size would be better ;) > (not that I have one) > > > Of course, if moving objsz earlier seems to work, it could stay where you > > put it, but the flag could make it clearer why you want to avoid certain > > optimizations. > > Well, all object-size testcases are pretty simplistic right now and > don't trigger the IPA inliner for example. > > > > Bootstrap / regtest pending on x86_64-unknown-linux-gnu. > > > > > > Similar candidate for the 4.8 branch. > > > > Please wait sufficiently long for trunk issues before you backport. > > Of course. So I had to do some more changes because doing objsz earlier (and separating it from strlenopt) exposes that the GENERIC folders (ugh) for strcat mess up points-to info as it folds strcat to memcpy (SAVE_EXPR , ...) which adds a new SSA name for the destination with no alias info associated. Rather than moving the non-constant builtin foldings to GIMPLE I chose to make sure to forward the constants in objsz and fold the stmts there (as PTA is re-run shortly after it). This then breaks some of the scan-tree-dumps in the strlenopt testcases because nobody has previously folded the _chk builtins with -1U size to non-_chk variants, so I have to adjust them (I didn't want to move the strlenopt pass as well). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Any objections? Thanks, Richard. 2013-11-18 Richard Biener PR tree-optimization/59125 PR tree-optimization/54570 * tree-ssa-sccvn.c (copy_reference_ops_from_ref): When inlining is not complete do not treat component-references with offset zero but different fields as equal. * tree-object-size.c: Include tree-phinodes.h and ssa-iterators.h. (compute_object_sizes): Apply TLC. 
Propagate the constant results into all uses and fold their stmts. * passes.def (pass_all_optimizations): Move pass_object_sizes after the first pass_forwprop and before pass_fre. * gcc.dg/builtin-object-size-8.c: Un-xfail. * gcc.dg/builtin-object-size-14.c: New testcase. * gcc.dg/strlenopt-14gf.c: Adjust. * gcc.dg/strlenopt-1f.c: Likewise. * gcc.dg/strlenopt-4gf.c: Likewise. Index: gcc/tree-ssa-sccvn.c === *** gcc/tree-ssa-sccvn.c.orig 2013-11-18 11:20:36.0 +0100 --- gcc/tree-ssa-sccvn.c2013-11-18 11:29:12.300337512 +0100 *** copy_reference_ops_from_ref (tree ref, v *** 760,766 } /* For non-calls, store the information that makes up the address. */ ! while (ref) { vn_reference_op_s temp; --- 760,766 } /* For non-calls, store the information that makes up the address. */ ! tree orig = ref; while (ref) { vn_ref
Re: Clean up LTO generation
On Mon, Nov 18, 2013 at 2:19 PM, Bernd Schmidt wrote: > On 11/18/2013 02:03 PM, Jan Hubicka wrote: >>> On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt >>> wrote: The following patch simplifies this by getting rid of all_lto_gen_passes. We could simplify a little further if we decided we really don't need two different timevars for different steps of LTO output. Bootstrapped and tested on x86_64-linux, ok? >>> >>> I'm fine with this - Honza, any objections? >> >> Not at all, I also considered this somewhat pointless. > > Ok, cool. Any opinions on the two timevars, should I reduce that before committing? No, it's useful to have both separated. Thanks, Richard. > > Bernd >
[PATCH] Use libbacktrace as libsanitizer's symbolizer
Hi! This patch adds a libbacktrace symbolizer to libsanitizer, with which we can avoid spawning and running an external program (especially when it is not even provided by gcc itself). I've kept the possibility to override the symbolizer by magic symbols (InternalSymbolizer), and as I have no idea how the llvm buildsystem etc. works and what the possibility is there to add libbacktrace, this just requires users to compile with SANITIZER_LIBBACKTRACE defined to signal that backtrace-supported.h and backtrace.h are available, and the sanitizer_symbolizer_posix_libcdep.cc source then decides based on backtrace-supported.h etc. whether it is usable. make check RUNTESTFLAGS='asan.exp ubsan.exp' passes with this. Some pending issues on the libbacktrace side: 1) right now libbacktrace performs dl_iterate_phdr only the first time backtrace_pcinfo or backtrace_syminfo is called; if there are some dlopens/dlcloses in between that and another querying of the symbolizer, it won't notice that. Perhaps it can be done only when we don't find a symbol and/or have some function that tries to dl_iterate_phdr again, look at cached st_ino/st_mtime or similar, and for the threaded version likely just keep old records, just add a flag to them that they should be ignored (or say atomically decrease symbol count to zero and something similar for debug info). 2) for tsan querying of data symbols, apparently the classes want to see not just the symbol name and start value, but also size. libbacktrace has all this info available, just doesn't pass it down to the callback. I wonder if we'd need to create yet another libbacktrace entrypoint, or if it would be acceptable to do a source code incompatible, ABI (at least on all sane targets) compatible version of just adding another uintptr_t symsize argument to backtrace_syminfo_callback. 3) I wonder if libbacktrace couldn't be updated to use __atomic_* builtins, then it could avoid the ugliness of emulating atomic loads and stores. 
As for the sanitizer, the reason I haven't implemented the SendCommand method for the libbacktrace symbolizer is that the library doesn't provide the info as text, but as individual values passed to the callback, so printing that to text and then parsing the text would be very ugly. libbacktrace doesn't even need the module names and module offsets, so supposedly we would need that only if libbacktrace failed to get accurate inline/call or symbol info. While the classes have both symbol name/file/line etc. and module name/offset fields, apparently the latter are unused if the former is filled in. 2013-11-18 Jakub Jelinek PR sanitizer/59136 * configure.ac: Don't add target-libbacktrace to noconfigdirs just because go hasn't been enabled if target-libsanitizer isn't in noconfigdirs. * Makefile.def: Add configure-target-libsanitizer dependency on configure-target-libbacktrace and all-target-libsanitizer dependency on configure-target-libsanitizer. * configure: Regenerated. libsanitizer/ * sanitizer_common/Makefile.am (AM_CPPFLAGS): If SANITIZER_LIBBACKTRACE, append -I for libbacktrace headers and -DSANITIZER_LIBBACKTRACE. * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc: Add libbacktrace symbolizer. * tsan/Makefile.am (libtsan_la_LIBADD): Add libbacktrace.la if SANITIZER_LIBBACKTRACE. * asan/Makefile.am (libasan_la_LIBADD): Likewise. * ubsan/Makefile.am (libubsan_la_LIBADD): Likewise. * configure.ac (SANITIZER_LIBBACKTRACE): New AM_CONDITIONAL. * sanitizer_common/Makefile.in: Regenerated. * tsan/Makefile.in: Regenerated. * asan/Makefile.in: Regenerated. * ubsan/Makefile.in: Regenerated. * configure: Regenerated. gcc/testsuite/ * c-c++-common/asan/strip-path-prefix-1.c: Allow also the filename:line instead of (modulename+offset) form with stripped initial / from the filename. 
--- libsanitizer/sanitizer_common/Makefile.am.jj2013-11-18 09:59:04.012244260 +0100 +++ libsanitizer/sanitizer_common/Makefile.am 2013-11-18 11:04:54.241360825 +0100 @@ -1,5 +1,8 @@ -AM_CPPFLAGS = -I $(top_srcdir)/include - +AM_CPPFLAGS = -I $(top_srcdir)/include +if SANITIZER_LIBBACKTRACE +AM_CPPFLAGS += -I $(top_srcdir)/../libbacktrace -I $(top_builddir)/../libbacktrace -DSANITIZER_LIBBACKTRACE +endif + # May be used by toolexeclibdir. gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER) --- libsanitizer/sanitizer_common/Makefile.in.jj2013-11-18 09:59:04.012244260 +0100 +++ libsanitizer/sanitizer_common/Makefile.in 2013-11-18 12:55:52.709310820 +0100 @@ -35,6 +35,7 @@ POST_UNINSTALL = : build_triplet = @build@ host_triplet = @host@ target_triplet = @target@ +@SANITIZER_LIBBACKTRACE_TRUE@am__append_1 = -I $(top_srcdir)/../libbacktrace -I $(top_builddir)/../libbacktrace -DSANITIZER_LIBBACKTRACE
Re: [PATCH] Implement -fsanitize=null + new sanopt pass
On Wed, Nov 13, 2013 at 12:13:48AM +0100, Marek Polacek wrote: > --- gcc/config/bootstrap-ubsan.mk.mp 2013-11-12 13:46:13.345182065 +0100 > +++ gcc/config/bootstrap-ubsan.mk 2013-11-12 13:46:49.812314016 +0100 > @@ -2,6 +2,6 @@ > > STAGE2_CFLAGS += -fsanitize=undefined > STAGE3_CFLAGS += -fsanitize=undefined > -POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread \ > +POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread -ldl \ Hopefully with my pending patch you can remove the -lpthread -ldl again, but ok for now. > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple stmt = gsi_stmt (gsi); > + > + if (gimple_code (stmt) != GIMPLE_CALL) if (is_gimple_call (stmt)) Ok with those changes. Jakub
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
On Mon, Nov 18, 2013 at 1:02 AM, Arnaud Charlet wrote: >> Actually there are two issues with your change: >> >> Using time_t for tv_nsec looks actually wrong, the definition on (my) >> linux is: >> >> struct timespec { >> __kernel_time_t tv_sec; /* seconds */ >> long tv_nsec; /* nanoseconds */ >> }; This is wrong for x32, where tv_nsec is long long, not long. There are a couple of places where long should be long long for x32. Glibc gets them right: /* A time value that is accurate to the nearest microsecond but also has a range of years. */ struct timeval { __time_t tv_sec; /* Seconds. */ __suseconds_t tv_usec; /* Microseconds. */ }; >> As you can see, the types for tv_sec and tv_nsec are different. >> >> So the change to timespec looks wrong to me. > In addition concerning s-taprop-linux.adb, the definition of struct > timeval on linux is: > > /* A time value that is accurate to the nearest > microsecond but also has a range of years. */ > struct timeval > { > __time_t tv_sec; /* Seconds. */ > __suseconds_t tv_usec; /* Microseconds. */ > }; > > So again, we have another type (__suseconds_t), and NOT time_t or __time_t > so it's really wrong for tv_nsec to use time_t. > For all Linux architectures, __suseconds_t is the same as time_t, including x32. If not, we can't use array for timeval. -- H.J.
[v3 patch] __shared_ptr should never use atomic ops
This is something I've been meaning to change for years, see http://gcc.gnu.org/ml/libstdc++/2007-10/msg00180.html for a prior post. Currently __shared_ptr specializations still dispatch through __gnu_cxx::__exchange_and_add_dispatch for ref-count updates. The attached patch means the _S_single policy is always non-atomic, so that you can get a faster (but not threadsafe!) smart pointer with: template<typename T> using shared_ptr_unsynchronized = std::__shared_ptr<T, __gnu_cxx::_S_single>; This is something I've needed and I've seen others ask about it too e.g. http://stackoverflow.com/q/15129263/981959 http://stackoverflow.com/q/9792668/981959 2013-11-18 Jonathan Wakely * include/bits/shared_ptr_base.h (_Sp_counted_base<_S_single>): Use non-atomic operations. * testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust line number. * testsuite/20_util/shared_ptr/cons/void_neg.cc: Likewise. Tested x86_64-linux, with --enable-threads and --disable-threads, committed to trunk. commit 1f4fcb52da0e14389ffd53bca6dc060a30d180c2 Author: Jonathan Wakely Date: Mon Nov 18 11:50:19 2013 + * include/bits/shared_ptr_base.h (_Sp_counted_base<_S_single>): Use non-atomic operations. * testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust line number. * testsuite/20_util/shared_ptr/cons/void_neg.cc: Likewise. 
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h index cf90d7a..68ccc9e 100644 --- a/libstdc++-v3/include/bits/shared_ptr_base.h +++ b/libstdc++-v3/include/bits/shared_ptr_base.h @@ -209,11 +209,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Sp_counted_base<_S_single>:: _M_add_ref_lock() { - if (__gnu_cxx::__exchange_and_add_dispatch(&_M_use_count, 1) == 0) - { - _M_use_count = 0; - __throw_bad_weak_ptr(); - } + if (_M_use_count == 0) + __throw_bad_weak_ptr(); + ++_M_use_count; } template<> @@ -248,6 +246,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __ATOMIC_RELAXED)); } + template<> +inline void +_Sp_counted_base<_S_single>::_M_add_ref_copy() +{ ++_M_use_count; } + + template<> +inline void +_Sp_counted_base<_S_single>::_M_release() noexcept +{ + if (--_M_use_count == 0) +{ + _M_dispose(); + if (--_M_weak_count == 0) +_M_destroy(); +} +} + + template<> +inline void +_Sp_counted_base<_S_single>::_M_weak_add_ref() noexcept +{ ++_M_weak_count; } + + template<> +inline void +_Sp_counted_base<_S_single>::_M_weak_release() noexcept +{ + if (--_M_weak_count == 0) +_M_destroy(); +} + + template<> +inline long +_Sp_counted_base<_S_single>::_M_get_use_count() const noexcept +{ return _M_use_count; } + // Forward declarations. 
template diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc index db3fcac..01acca5 100644 --- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc @@ -32,7 +32,7 @@ void test01() { X* px = 0; std::shared_ptr p1(px); // { dg-error "here" } - // { dg-error "incomplete" "" { target *-*-* } 779 } + // { dg-error "incomplete" "" { target *-*-* } 812 } std::shared_ptr p9(ap()); // { dg-error "here" } // { dg-error "incomplete" "" { target *-*-* } 307 } diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc index 3fd38cf..5389159 100644 --- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc @@ -25,5 +25,5 @@ void test01() { std::shared_ptr p((void*)nullptr); // { dg-error "here" } - // { dg-error "incomplete" "" { target *-*-* } 778 } + // { dg-error "incomplete" "" { target *-*-* } 811 } }
Re: [PATCH] Implement -fsanitize=null + new sanopt pass
On Mon, Nov 18, 2013 at 02:51:41PM +0100, Jakub Jelinek wrote: > On Wed, Nov 13, 2013 at 12:13:48AM +0100, Marek Polacek wrote: > > --- gcc/config/bootstrap-ubsan.mk.mp2013-11-12 13:46:13.345182065 > > +0100 > > +++ gcc/config/bootstrap-ubsan.mk 2013-11-12 13:46:49.812314016 +0100 > > @@ -2,6 +2,6 @@ > > > > STAGE2_CFLAGS += -fsanitize=undefined > > STAGE3_CFLAGS += -fsanitize=undefined > > -POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread \ > > +POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread -ldl > > \ > > Hopefully with my pending patch you can remove the -lpthread -ldl again, but > ok for now. > > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > > + { > > + gimple stmt = gsi_stmt (gsi); > > + > > + if (gimple_code (stmt) != GIMPLE_CALL) > > if (is_gimple_call (stmt)) > > Ok with those changes. Oh, one more thing, please update gcc/doc/, the -fsanitize= description is far from up to date there. Jakub
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
> >> struct timespec { > >> __kernel_time_t tv_sec; /* seconds */ > >> long tv_nsec; /* nanoseconds */ > >> }; > > This is wrong for x32 where tv_nsec is long long, not > long. There are a couple places where long should be > long long for x32. Well yes, but the type is *not* time_t, it's conceptually a different type. > Glibc gets them right properly: But that's for timeval, not timespec. > All for Linux architectures, __suseconds_t and > __suseconds_t are the same as time_t, including > x32. If not, we can't use array for timeval. Well this PR is about x32 being different from all other linux ports. There's no guarantee that we won't have exactly the same issue on future ports where __suseconds_t is *not* the same as time_t. Also there's the issue of s-osinte-solaris-posix.ads which is broken with your change right now. The short term solution would be to apply this change: --- s-osinte-solaris-posix.ads (revision 298928) +++ s-osinte-solaris-posix.ads (working copy) @@ -513,7 +513,7 @@ type timespec is record tv_sec : time_t; - tv_nsec : long; + tv_nsec : time_t; end record; pragma Convention (C, timespec); But I'd rather settle on which type is appropriate for the tv_nsec field before making this change. Arno
Re: [PATCH] Support -fsanitize=leak
On Fri, Nov 15, 2013 at 12:34:14PM -0800, Ian Lance Taylor wrote: > On Fri, Nov 15, 2013 at 11:41 AM, Jakub Jelinek wrote: > > > > This patch adds support for -fsanitize=leak and -static-liblsan options. > > If combined with -fsanitize=address, it does nothing, otherwise it links > > in liblsan, a new shared+static library (on x86_64-linux only so far, > > the code isn't 32-bit ready apparently). > > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Documentation? Here it is, as incremental patch: 2013-11-18 Jakub Jelinek * doc/invoke.texi (-static-liblsan, -fsanitize=leak): Document. --- gcc/doc/invoke.texi.jj 2013-11-18 09:59:09.0 +0100 +++ gcc/doc/invoke.texi 2013-11-18 14:57:12.240073775 +0100 @@ -454,7 +454,7 @@ Objective-C and Objective-C++ Dialects}. @gccoptlist{@var{object-file-name} -l@var{library} @gol -nostartfiles -nodefaultlibs -nostdlib -pie -rdynamic @gol -s -static -static-libgcc -static-libstdc++ @gol --static-libasan -static-libtsan -static-libubsan @gol +-static-libasan -static-libtsan -static-liblsan -static-libubsan @gol -shared -shared-libgcc -symbolic @gol -T @var{script} -Wl,@var{option} -Xlinker @var{option} @gol -u @var{symbol}} @@ -5259,6 +5259,13 @@ Memory access instructions will be instr data race bugs. See @uref{http://code.google.com/p/data-race-test/wiki/ThreadSanitizer} for more details. +@item -fsanitize=leak +Enable LeakSanitizer, a memory leak detector. +This option only matters for linking of executables and if neither +@option{-fsanitize=address} nor @option{-fsanitize=thread} is used. In that +case it will link the executable against a library that overrides @code{malloc} +and other allocator functions. 
+ @item -fsanitize=undefined Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector Various computations will be instrumented to detect undefined behavior @@ -10181,6 +10188,15 @@ option is not used, then this links agai driver to link @file{libtsan} statically, without necessarily linking other libraries statically. +@item -static-liblsan +When the @option{-fsanitize=leak} option is used to link a program, +the GCC driver automatically links against @option{liblsan}. If +@file{liblsan} is available as a shared library, and the @option{-static} +option is not used, then this links against the shared version of +@file{liblsan}. The @option{-static-liblsan} option directs the GCC +driver to link @file{liblsan} statically, without necessarily linking +other libraries statically. + @item -static-libubsan When the @option{-fsanitize=undefined} option is used to link a program, the GCC driver automatically links against @option{libubsan}. If Jakub
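For readers wanting to try the new option, here is a minimal sketch of the kind of program LeakSanitizer is meant to catch (the file name and build line are illustrative, not part of the patch): compile with something like `gcc -fsanitize=leak leak.c`, run the binary, and the allocation below — never freed and unreachable at exit — should be reported.

```c
#include <stdlib.h>
#include <string.h>

/* A deliberate leak: the buffer is allocated, used, and then
   abandoned when the only pointer to it goes out of scope.  */
static size_t
leak_some_memory (void)
{
  char *buf = malloc (64);
  if (buf == NULL)
    return 0;
  strcpy (buf, "never freed");
  return strlen (buf);  /* buf is leaked here */
}
```

Built with plain gcc this just leaks quietly; linked with -fsanitize=leak (or -static-liblsan) the overriding allocator records the allocation and reports it when the program exits.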
Re: [PATCH] Support -fsanitize=leak
On Mon, Nov 18, 2013 at 01:15:02PM +0100, Marek Polacek wrote: > Moreover, it fixes the bootstrap-ubsan failure when > building fixincl - I guess the gnu-user.h hunk was missing. Thanks. Sorry, turned out this is bogus. It really will need some configure/Makefile fix. Marek
[PATCH, i386, MPX, 2/X] Pointers Checker [22/25] Target builtins
Hi, Here is a patch introducing i386 target versions of Pointer Bounds Checker builtins. Thanks, Ilya -- 2013-11-15 Ilya Enkovich * config/i386/i386-builtin-types.def (BND): New. (ULONG): New. (BND_FTYPE_PCVOID_ULONG): New. (VOID_FTYPE_BND_PCVOID): New. (VOID_FTYPE_PCVOID_PCVOID_BND): New. (BND_FTYPE_PCVOID_PCVOID): New. (BND_FTYPE_PCVOID): New. (BND_FTYPE_BND_BND): New. (PVOID_FTYPE_PVOID_PVOID_ULONG): New. (PVOID_FTYPE_PCVOID_BND_ULONG): New. (ULONG_FTYPE_VOID): New. (PVOID_FTYPE_BND): New. * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h. (ix86_builtins): Add IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDSET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_ARG_BND, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (builtin_isa): Add leaf_p and nothrow_p fields. (def_builtin): Initialize leaf_p and nothrow_p. (ix86_add_new_builtins): Handle leaf_p and nothrow_p flags. (bdesc_mpx): New. (bdesc_mpx_const): New. (ix86_init_mpx_builtins): New. (ix86_init_builtins): Call ix86_init_mpx_builtins. (ix86_expand_builtin): expand IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDSET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_ARG_BND, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def index c866170..f82ac9b 100644 --- a/gcc/config/i386/i386-builtin-types.def +++ b/gcc/config/i386/i386-builtin-types.def @@ -47,6 +47,7 @@ DEF_PRIMITIVE_TYPE (UCHAR, unsigned_char_type_node) DEF_PRIMITIVE_TYPE (QI, char_type_node) DEF_PRIMITIVE_TYPE (HI, intHI_type_node) DEF_PRIMITIVE_TYPE (SI, intSI_type_node) +DEF_PRIMITIVE_TYPE (BND, pointer_bounds_type_node) # ??? 
Logically this should be intDI_type_node, but that maps to "long" # with 64-bit, and that's not how the emmintrin.h is written. Again, # changing this would change name mangling. @@ -60,6 +61,7 @@ DEF_PRIMITIVE_TYPE (USHORT, short_unsigned_type_node) DEF_PRIMITIVE_TYPE (INT, integer_type_node) DEF_PRIMITIVE_TYPE (UINT, unsigned_type_node) DEF_PRIMITIVE_TYPE (UNSIGNED, unsigned_type_node) +DEF_PRIMITIVE_TYPE (ULONG, long_unsigned_type_node) DEF_PRIMITIVE_TYPE (LONGLONG, long_long_integer_type_node) DEF_PRIMITIVE_TYPE (ULONGLONG, long_long_unsigned_type_node) DEF_PRIMITIVE_TYPE (UINT8, unsigned_char_type_node) @@ -239,6 +241,7 @@ DEF_FUNCTION_TYPE (V4DI, V8HI) DEF_FUNCTION_TYPE (V4DI, V4SI) DEF_FUNCTION_TYPE (V4DI, PV4DI) DEF_FUNCTION_TYPE (V4DI, V2DI) +DEF_FUNCTION_TYPE (BND, PCVOID, ULONG) DEF_FUNCTION_TYPE (DI, V2DI, INT) DEF_FUNCTION_TYPE (DOUBLE, V2DF, INT) @@ -374,6 +377,7 @@ DEF_FUNCTION_TYPE (VOID, PV4DI, V4DI) DEF_FUNCTION_TYPE (VOID, PV4SF, V4SF) DEF_FUNCTION_TYPE (VOID, PV8SF, V8SF) DEF_FUNCTION_TYPE (VOID, UNSIGNED, UNSIGNED) +DEF_FUNCTION_TYPE (VOID, BND, PCVOID) DEF_FUNCTION_TYPE (INT, V16QI, V16QI, INT) DEF_FUNCTION_TYPE (UCHAR, UINT, UINT, UINT) @@ -439,6 +443,14 @@ DEF_FUNCTION_TYPE (V8UHI, V8UHI, V8UHI, V8UHI) DEF_FUNCTION_TYPE (V16UQI, V16UQI, V16UQI, V16UQI) DEF_FUNCTION_TYPE (V4DF, V4DF, V4DF, V4DI) DEF_FUNCTION_TYPE (V8SF, V8SF, V8SF, V8SI) +DEF_FUNCTION_TYPE (VOID, PCVOID, PCVOID, BND) +DEF_FUNCTION_TYPE (BND, PCVOID, PCVOID) +DEF_FUNCTION_TYPE (BND, PCVOID) +DEF_FUNCTION_TYPE (BND, BND, BND) +DEF_FUNCTION_TYPE (PVOID, PVOID, PVOID, ULONG) +DEF_FUNCTION_TYPE (PVOID, PCVOID, BND, ULONG) +DEF_FUNCTION_TYPE (ULONG, VOID) +DEF_FUNCTION_TYPE (PVOID, BND) DEF_FUNCTION_TYPE (V2DI, V2DI, V2DI, UINT, UINT) DEF_FUNCTION_TYPE (V4HI, HI, HI, HI, HI) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index a427c15..6ddd37a 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -64,6 +64,8 @@ along with GCC; see the file COPYING3. 
If not see #include "tree-pass.h" #include "context.h" #include "pass_manager.h" +#include "tree-chkp.h" +#include "rtl-chkp.h" static rtx legitimize_dllimport_symbol (rtx, bool); static rtx legitimize_pe_coff_extern_decl (rtx, bool); @@ -27742,6 +27744,21 @@ enum ix86_builtins IX86_BUILTIN_XABORT, IX86_BUILTIN_XTEST, + /* MPX */ + IX86_BUILTIN_BNDMK, + IX86_BUILTIN_BNDSTX, + IX86_BUILTIN_BNDLDX, + IX86_BUILTIN_BNDCL, + IX86_BUILTIN_BNDCU, + IX86_BUILTIN_BNDRET, + IX86_BUILTIN_BNDSET, + IX86_BUILTIN_BNDNARROW, + IX86_BUILTIN_BNDINT, + IX86_BUILTIN_ARG_BND, + IX86_BUILTIN_SIZEOF, + IX86_BUILTIN_BNDLOWER, + IX86_BUILTIN_BNDUPPER, + /* BMI instructions. */ IX86_BUILTIN_BEXTR32, IX86_BUILTIN_BEXTR64, @@ -27811,6 +27828,8
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
On Mon, Nov 18, 2013 at 5:57 AM, Arnaud Charlet wrote: >> >> struct timespec { >> >> __kernel_time_t tv_sec; /* seconds */ >> >> long tv_nsec; /* nanoseconds */ >> >> }; >> >> This is wrong for x32 where tv_nsec is long long, not >> long. There are a couple places where long should be >> long long for x32. > > Well yes, but the type is *not* time_t, it's conceptually a different type. > >> Glibc gets them right properly: > > But that's for timeval, not timespec. Glibc has struct timespec { __time_t tv_sec; /* Seconds. */ __syscall_slong_t tv_nsec; /* Nanoseconds. */ }; __syscall_slong_t is long long for x32. >> All for Linux architectures, __suseconds_t and >> __suseconds_t are the same as time_t, including >> x32. If not, we can't use array for timeval. > > Well this PR is about x32 being different from all other linux ports. There's > no guarantee that we won't have exactly the same issue on future ports where > __suseconds_t is *not* the same as time_t. Ada was using long for time_t and type timeval is array (1 .. 2) of C.long. It assumes that the type of tv_nsec is the same as tv_sec. > Also there's the issue of s-osinte-solaris-posix.ads which is broken with > your change right now. > > The short term solution would be to apply this change: > > --- s-osinte-solaris-posix.ads (revision 298928) > +++ s-osinte-solaris-posix.ads (working copy) > @@ -513,7 +513,7 @@ > > type timespec is record > tv_sec : time_t; > - tv_nsec : long; > + tv_nsec : time_t; > end record; > pragma Convention (C, timespec); > > But I'd rather settle on which type is appropriate for the tv_nsec > field before making this change. > __syscall_slong_t is a Linux specific type. We can add tv_nsec_t, which should be the same as time_t for all the current targets. -- H.J.
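To make the layout problem concrete, here is a hand-written illustration (the struct names are stand-ins, not the real kernel or glibc headers): on x32, long is 32 bits but the kernel's __kernel_time_t and __syscall_slong_t are 64 bits, so a binding that declares tv_nsec as plain long gives the field the wrong width.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the two layouts under discussion.  On x32 the
   difference is real because sizeof (long) == 4 while the kernel
   reads a 64-bit tv_nsec slot; on LP64 hosts the two coincide.  */
struct timespec_narrow { long long tv_sec; long tv_nsec; };      /* tv_nsec possibly too narrow */
struct timespec_x32    { long long tv_sec; long long tv_nsec; }; /* what the x32 kernel expects */

/* Width in bytes of the tv_nsec field in each layout.  */
static size_t
nsec_width_narrow (void)
{
  return sizeof (((struct timespec_narrow *) 0)->tv_nsec);
}

static size_t
nsec_width_x32 (void)
{
  return sizeof (((struct timespec_x32 *) 0)->tv_nsec);
}
```

On a typical x86-64 build both widths come out equal, which is exactly why the bug only surfaced on x32, where the narrow variant drops to 4 bytes.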
Use anti-ranges in memcpy/memset expansion
Hi, this patch adds support for anti-ranges to determine_block_size. This makes it possible to bound the size of the block from below, which is useful for avoiding a separate small-size code path, and it also allows us to inline the following: void *a; void *b; t(int c) { if (c<10) memcpy (a,b,c); } Because c is signed, we do not really know that the block is smaller than 10, but it is most likely the case. Bootstrapped/regtested x86_64-linux OK? Honza * md.texi (setmem): Document new parameter. * optabs.c (maybe_gen_insn): Support 9 operands. * builtins.c (determine_block_size): Add probable_max_size; support anti-ranges. (expand_builtin_memcpy, expand_builtin_memset_args): Pass around probable_max_size. * expr.c (emit_block_move_via_movmem, emit_block_move_hints, emit_block_move, clear_storage_hints, set_storage_via_setmem): Likewise. * expr.h (emit_block_move_hints, clear_storage_hints, set_storage_via_setmem): Update prototype. * i386.md (setmem, movmem patterns): Add 9th operand. * i386-protos.h (ix86_expand_set_or_movmem): Update prototype. * i386.c (ix86_expand_set_or_movmem): Take probable_max_size_exp argument; pass it to decide_alg. * gcc.target/i386/memcpy-3.c: New testcase. Index: doc/md.texi === *** doc/md.texi (revision 204945) --- doc/md.texi (working copy) *** all cases. This expected alignment is al *** 5352,5357 --- 5352,5359 Expected size, when unknown, is set to @code{(const_int -1)}. Operand 7 is the minimal size of the block and operand 8 is the maximal size of the block (NULL if it can not be represented as CONST_INT). + Operand 9 is the probable maximal size (i.e. we can not rely on it for correctness, + but it can be used for choosing proper code sequence for a given size). The use for multiple @code{setmem@var{m}} is as for @code{movmem@var{m}}. 
Index: optabs.c === *** optabs.c(revision 204945) --- optabs.c(working copy) *** maybe_gen_insn (enum insn_code icode, un *** 8229,8234 --- 8229,8238 return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value, ops[3].value, ops[4].value, ops[5].value, ops[6].value, ops[7].value); + case 9: + return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value, + ops[3].value, ops[4].value, ops[5].value, + ops[6].value, ops[7].value, ops[8].value); } gcc_unreachable (); } Index: builtins.c === *** builtins.c (revision 204945) --- builtins.c (working copy) *** builtin_memcpy_read_str (void *data, HOS *** 3096,3107 } /* LEN specify length of the block of memcpy/memset operation. !Figure out its range and put it into MIN_SIZE/MAX_SIZE. */ static void determine_block_size (tree len, rtx len_rtx, unsigned HOST_WIDE_INT *min_size, ! unsigned HOST_WIDE_INT *max_size) { if (CONST_INT_P (len_rtx)) { --- 3096,3110 } /* LEN specify length of the block of memcpy/memset operation. !Figure out its range and put it into MIN_SIZE/MAX_SIZE. !In some cases we can make very likely guess on max size, then we !set it into PROBABLE_MAX_SIZE. */ static void determine_block_size (tree len, rtx len_rtx, unsigned HOST_WIDE_INT *min_size, ! unsigned HOST_WIDE_INT *max_size, ! unsigned HOST_WIDE_INT *probable_max_size) { if (CONST_INT_P (len_rtx)) { *** determine_block_size (tree len, rtx len_ *** 3111,3138 else { double_int min, max; ! if (TREE_CODE (len) == SSA_NAME ! && get_range_info (len, &min, &max) == VR_RANGE) { ! if (min.fits_uhwi ()) *min_size = min.to_uhwi (); ! else ! *min_size = 0; ! if (max.fits_uhwi ()) ! *max_size = max.to_uhwi (); ! else ! *max_size = (HOST_WIDE_INT)-1; } ! else { ! if (host_integerp (TYPE_MIN_VALUE (TREE_TYPE (len)), 1)) ! *min_size = tree_low_cst (TYPE_MIN_VALUE (TREE_TYPE (len)), 1); ! else ! *min_size = 0; ! if (host_integerp (TYPE_MAX_VALUE (TREE_TYPE (len)), 1)) ! *max_size = tree_low_cst (TYPE_MAX_VALUE (TREE_TYPE (len)), 1); ! else ! 
*max_size = GET_MODE_MASK (GET_MODE (len_rtx)); } } gcc_checking_assert (*max_size <= --- 3114,3160 else { double_int min, max; ! enum value_range_type range_type = VR_UNDEFINED; ! ! /* Determine bounds from
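The anti-range case can be paraphrased outside the compiler as follows (a hand-written sketch with illustrative names; the committed determine_block_size works on GCC's double_int ranges and differs in detail). For `if (c < 10) memcpy (a, b, c)` with signed c, the unsigned length excludes an interval of mid-sized values: small lengths below the excluded interval are the likely ones, while the huge wrapped-around values are possible but improbable.

```c
#include <assert.h>

/* Sketch: derive block-size bounds from a value range on the length.
   An anti-range ~[lo, hi] means everything EXCEPT [lo, hi] is
   possible.  It cannot bound the size from above, but when lo > 0 the
   small values below lo are the likely ones, so lo - 1 becomes the
   "probable" maximum; and when the excluded interval starts at 0,
   hi + 1 bounds the size from below.  */
struct block_bounds { unsigned long min_size, max_size, probable_max; };

static struct block_bounds
bounds_from_anti_range (unsigned long lo, unsigned long hi,
			unsigned long type_max)
{
  struct block_bounds b = { 0, type_max, type_max };
  if (lo > 0)
    b.probable_max = lo - 1;	/* e.g. c < 10: probably at most 9 bytes */
  else if (hi < type_max)
    b.min_size = hi + 1;	/* excluded [0, hi]: at least hi + 1 bytes */
  return b;
}
```

The second branch is the "bound from below" case the patch description mentions: knowing the block is at least hi + 1 bytes lets the expander skip the small-size code path entirely.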
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
> Ada was using long for time_t and > > type timeval is array (1 .. 2) of C.long > > It assumes that the type of tv_nsec is the same as tv_sec. Yes, and that was indeed wrong/dangerous. > > --- s-osinte-solaris-posix.ads (revision 298928) > > +++ s-osinte-solaris-posix.ads (working copy) > > @@ -513,7 +513,7 @@ > > > > type timespec is record > >tv_sec : time_t; > > - tv_nsec : long; > > + tv_nsec : time_t; > > end record; > > pragma Convention (C, timespec); > > > > But I'd rather settle on which type is appropriate for the tv_nsec > > field before making this change. > > > > __syscall_slong_t is a Linux specific type. We > can add tv_nsec_t, which should be the same > as time_t for all the current targets. Introducing tv_nsec_t looks reasonable to me. Arno
Re: Add value range support into memcpy/memset expansion
On Sun, Nov 17, 2013 at 3:38 PM, Jan Hubicka wrote: > Hi, > this is version I comitted. It also adds a testcase and enables the support > in i386 backend. > > Honza > > * doc/md.texi (setmem, movstr): Update documentation. > * builtins.c (determine_block_size): New function. > (expand_builtin_memcpy): Use it and pass it to > emit_block_move_hints. > (expand_builtin_memset_args): Use it and pass it to > set_storage_via_setmem. > * expr.c (emit_block_move_via_movmem): Add min_size/max_size > parameters; > update call to expander. > (emit_block_move_hints): Add min_size/max_size parameters. > (clear_storage_hints): Likewise. > (set_storage_via_setmem): Likewise. > (clear_storage): Update. > * expr.h (emit_block_move_hints, clear_storage_hints, > set_storage_via_setmem): Update prototype. > * i386.c (ix86_expand_set_or_movmem): Add bounds; export. > (ix86_expand_movmem, ix86_expand_setmem): Remove. > (ix86_expand_movmem, ix86_expand_setmem): Remove. > * i386.md (movmem, setmem): Pass parameters. > > * testsuite/gcc.target/i386/memcpy-2.c: New testcase. The new testcase fails for me on x86 and x86-64: FAIL: gcc.target/i386/memcpy-2.c scan-assembler-not memcpy FAIL: gcc.target/i386/memcpy-2.c (test for excess errors) -- H.J.
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
On Mon, Nov 18, 2013 at 6:17 AM, Arnaud Charlet wrote: >> > >> > type timespec is record >> >tv_sec : time_t; >> > - tv_nsec : long; >> > + tv_nsec : time_t; >> > end record; >> > pragma Convention (C, timespec); >> > >> > But I'd rather settle on which type is appropriate for the tv_nsec >> > field before making this change. >> > >> >> __syscall_slong_t is a Linux specific type. We >> can add tv_nsec_t, which should be the same >> as time_t for all the current targets. > > Introducing tv_nsec_t looks reasonable to me. > Can you make the change? Thanks. -- H.J.
Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec
> >> __syscall_slong_t is a Linux specific type. We > >> can add tv_nsec_t, which should be the same > >> as time_t for all the current targets. > > > > Introducing tv_nsec_t looks reasonable to me. > > > > Can you make the change? > > Thanks. Not right now, I have lots of other things to do. Arno
Re: libsanitizer merge from upstream r191666
On Fri, Nov 15, 2013 at 11:10:18AM +0100, Jakub Jelinek wrote: > 2013-11-15 Jakub Jelinek > > * cfgexpand.c (struct stack_vars_data): Add asan_base and asan_alignb > fields. > (expand_stack_vars): For -fsanitize=address, use (and set initially) > data->asan_base as base for vars and update asan_alignb. > (expand_used_vars): Initialize data.asan_base and data.asan_alignb. > Pass them to asan_emit_stack_protection. > * asan.c (asan_detect_stack_use_after_return): New variable. > (asan_emit_stack_protection): Add pbase and alignb arguments. > Implement use after return sanitization. > * asan.h (asan_emit_stack_protection): Adjust prototype. > (ASAN_STACK_MAGIC_USE_AFTER_RET, ASAN_STACK_RETIRED_MAGIC): Define. Here is an updated version of that, that applies cleanly against Honza's latest committed changes (but there are further changes pending AFAIK, so likely yet another ", sz" needed) and increments also the size of the bottom (aka left) red zone if alignment 64 bytes or higher is needed, so that we don't (temporarily??) avoid wasting space and runtime. 2013-11-18 Jakub Jelinek * cfgexpand.c (struct stack_vars_data): Add asan_base and asan_alignb fields. (expand_stack_vars): For -fsanitize=address, use (and set initially) data->asan_base as base for vars and update asan_alignb. (expand_used_vars): Initialize data.asan_base and data.asan_alignb. Pass them to asan_emit_stack_protection. * asan.c (asan_detect_stack_use_after_return): New variable. (asan_emit_stack_protection): Add pbase and alignb arguments. Implement use after return sanitization. * asan.h (asan_emit_stack_protection): Adjust prototype. (ASAN_STACK_MAGIC_USE_AFTER_RET, ASAN_STACK_RETIRED_MAGIC): Define. --- gcc/cfgexpand.c.jj 2013-11-18 09:59:02.772253297 +0100 +++ gcc/cfgexpand.c 2013-11-18 15:28:48.770478076 +0100 @@ -879,6 +879,12 @@ struct stack_vars_data /* Vector of partition representative decls in between the paddings. 
*/ vec asan_decl_vec; + + /* Base pseudo register for Address Sanitizer protected automatic vars. */ + rtx asan_base; + + /* Alignment needed for the Address Sanitizer protected automatic vars. */ + unsigned int asan_alignb; }; /* A subroutine of expand_used_vars. Give each partition representative @@ -963,6 +969,7 @@ expand_stack_vars (bool (*pred) (size_t) alignb = stack_vars[i].alignb; if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT) { + base = virtual_stack_vars_rtx; if ((flag_sanitize & SANITIZE_ADDRESS) && pred) { HOST_WIDE_INT prev_offset = frame_offset; @@ -991,10 +998,13 @@ expand_stack_vars (bool (*pred) (size_t) if (repr_decl == NULL_TREE) repr_decl = stack_vars[i].decl; data->asan_decl_vec.safe_push (repr_decl); + data->asan_alignb = MAX (data->asan_alignb, alignb); + if (data->asan_base == NULL) + data->asan_base = gen_reg_rtx (Pmode); + base = data->asan_base; } else offset = alloc_stack_frame_space (stack_vars[i].size, alignb); - base = virtual_stack_vars_rtx; base_align = crtl->max_used_stack_slot_alignment; } else @@ -1781,6 +1791,8 @@ expand_used_vars (void) data.asan_vec = vNULL; data.asan_decl_vec = vNULL; + data.asan_base = NULL_RTX; + data.asan_alignb = 0; /* Reorder decls to be protected by iterating over the variables array multiple times, and allocating out of each phase in turn. 
*/ @@ -1805,16 +1817,29 @@ expand_used_vars (void) if (!data.asan_vec.is_empty ()) { HOST_WIDE_INT prev_offset = frame_offset; - HOST_WIDE_INT offset - = alloc_stack_frame_space (ASAN_RED_ZONE_SIZE, - ASAN_RED_ZONE_SIZE); + HOST_WIDE_INT offset, sz; + sz = ASAN_RED_ZONE_SIZE; + sz = data.asan_vec[0] - prev_offset; + if (data.asan_alignb > ASAN_RED_ZONE_SIZE + && data.asan_alignb <= 4096 + && sz + ASAN_RED_ZONE_SIZE >= data.asan_alignb) + { + sz = ((sz + ASAN_RED_ZONE_SIZE + data.asan_alignb - 1) + & ~(data.asan_alignb - HOST_WIDE_INT_1)) - sz; + } + else + sz = ASAN_RED_ZONE_SIZE; + offset + = alloc_stack_frame_space (sz, ASAN_RED_ZONE_SIZE); data.asan_vec.safe_push (prev_offset); data.asan_vec.safe_push (offset); var_end_seq = asan_emit_stack_protection (virtual_stack_vars_rtx, + data.asan_base, + data.asan_alignb, data.asan_vec.address (), -
Re: [PATCH] Implement -fsanitize=null + new sanopt pass
On Mon, Nov 18, 2013 at 02:51:41PM +0100, Jakub Jelinek wrote: > On Wed, Nov 13, 2013 at 12:13:48AM +0100, Marek Polacek wrote: > > --- gcc/config/bootstrap-ubsan.mk.mp2013-11-12 13:46:13.345182065 > > +0100 > > +++ gcc/config/bootstrap-ubsan.mk 2013-11-12 13:46:49.812314016 +0100 > > @@ -2,6 +2,6 @@ > > > > STAGE2_CFLAGS += -fsanitize=undefined > > STAGE3_CFLAGS += -fsanitize=undefined > > -POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread \ > > +POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread -ldl > > \ > > Hopefully with my pending patch you can remove the -lpthread -ldl again, but > ok for now. Cool. > > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > > + { > > + gimple stmt = gsi_stmt (gsi); > > + > > + if (gimple_code (stmt) != GIMPLE_CALL) > > if (is_gimple_call (stmt)) Fixed. > Ok with those changes. Thanks. Also I'll have to add some headers after the gimple.h reorg, but that is an obvious change. Marek
Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk
On 16/11/13 01:20, Aldy Hernandez wrote: On 11/15/13 15:05, Aldy Hernandez wrote: Why all this, and not keep everything but GF_OMP_FOR_KIND_SIMD and GF_OMP_FOR_KIND_DISTRIBUTE as they were, and just use: GF_OMP_FOR_KIND_DISTIRBUTE = 1 << 0, GF_OMP_FOR_KIND_SIMD = 2 << 0, GF_OMP_FOR_KIND_CILKSIMD = 3 << 0, Sounds good. Testing the following patch against my previous patch which had already been committed. If there are no regressions, I will commit. Aldy Committed. Hi Aldy, The testcase c-c++-common/cilk-plus/PS/body.c needs an effective target check for fopenmp before adding -fopenmp to its options, otherwise it'll fail on bare-metal targets like arm-none-eabi. This patch adds that. Ok to commit? Thanks, Kyrill 2013-11-18 Kyrylo Tkachov * c-c++-common/cilk-plus/PS/body.c: Add fopenmp effective target check. diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c index 9b10041..82c0a0c 100644 --- a/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c +++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c @@ -1,5 +1,6 @@ /* { dg-do compile } */ /* { dg-options "-fcilkplus -fopenmp" } */ +/* { dg-require-effective-target fopenmp } */ int *a, *b, c; void *jmpbuf[10];
Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
I would agree that the example is just for the case cost model makes correct estimation But how can we assure ourself that it won't have any mistakes in the future? I believe it'll be Ok to introduce an extra flag as Jakub proposed for the dedicated simd-forced vectorization to use unlimited cost model. This can be default for -fopenmp or there should be a warning issued that compiler overrides user's request of vectorization. In such a case user can enforce vectorization (even with mentioned results :) with this unlimited cost model for simd. On Fri, Nov 15, 2013 at 6:24 PM, Richard Biener wrote: > On Fri, 15 Nov 2013, Sergey Ostanevich wrote: > >> Richard, >> >> here's an example that causes trigger for the cost model. > > I hardly believe that (AVX2) > > .L9: > vmovups (%rsi), %xmm3 > addl$1, %r8d > addq$256, %rsi > vinsertf128 $0x1, -240(%rsi), %ymm3, %ymm1 > vmovups -224(%rsi), %xmm3 > vinsertf128 $0x1, -208(%rsi), %ymm3, %ymm3 > vshufps $136, %ymm3, %ymm1, %ymm3 > vperm2f128 $3, %ymm3, %ymm3, %ymm2 > vshufps $68, %ymm2, %ymm3, %ymm1 > vshufps $238, %ymm2, %ymm3, %ymm2 > vmovups -192(%rsi), %xmm3 > vinsertf128 $1, %xmm2, %ymm1, %ymm2 > vinsertf128 $0x1, -176(%rsi), %ymm3, %ymm1 > vmovups -160(%rsi), %xmm3 > vinsertf128 $0x1, -144(%rsi), %ymm3, %ymm3 > vshufps $136, %ymm3, %ymm1, %ymm3 > vperm2f128 $3, %ymm3, %ymm3, %ymm1 > vshufps $68, %ymm1, %ymm3, %ymm4 > vshufps $238, %ymm1, %ymm3, %ymm1 > vmovups -128(%rsi), %xmm3 > vinsertf128 $1, %xmm1, %ymm4, %ymm1 > vshufps $136, %ymm1, %ymm2, %ymm1 > vperm2f128 $3, %ymm1, %ymm1, %ymm2 > vshufps $68, %ymm2, %ymm1, %ymm4 > vshufps $238, %ymm2, %ymm1, %ymm2 > vinsertf128 $0x1, -112(%rsi), %ymm3, %ymm1 > vmovups -96(%rsi), %xmm3 > vinsertf128 $1, %xmm2, %ymm4, %ymm4 > vinsertf128 $0x1, -80(%rsi), %ymm3, %ymm3 > vshufps $136, %ymm3, %ymm1, %ymm3 > vperm2f128 $3, %ymm3, %ymm3, %ymm2 > vshufps $68, %ymm2, %ymm3, %ymm1 > vshufps $238, %ymm2, %ymm3, %ymm2 > vmovups -64(%rsi), %xmm3 > vinsertf128 $1, %xmm2, %ymm1, %ymm2 > 
vinsertf128 $0x1, -48(%rsi), %ymm3, %ymm1 > vmovups -32(%rsi), %xmm3 > vinsertf128 $0x1, -16(%rsi), %ymm3, %ymm3 > cmpl%r8d, %edi > vshufps $136, %ymm3, %ymm1, %ymm3 > vperm2f128 $3, %ymm3, %ymm3, %ymm1 > vshufps $68, %ymm1, %ymm3, %ymm5 > vshufps $238, %ymm1, %ymm3, %ymm1 > vinsertf128 $1, %xmm1, %ymm5, %ymm1 > vshufps $136, %ymm1, %ymm2, %ymm1 > vperm2f128 $3, %ymm1, %ymm1, %ymm2 > vshufps $68, %ymm2, %ymm1, %ymm3 > vshufps $238, %ymm2, %ymm1, %ymm2 > vinsertf128 $1, %xmm2, %ymm3, %ymm1 > vshufps $136, %ymm1, %ymm4, %ymm1 > vperm2f128 $3, %ymm1, %ymm1, %ymm2 > vshufps $68, %ymm2, %ymm1, %ymm3 > vshufps $238, %ymm2, %ymm1, %ymm2 > vinsertf128 $1, %xmm2, %ymm3, %ymm2 > vaddps %ymm2, %ymm0, %ymm0 > ja .L9 > > is more efficient than > > .L3: > vaddss (%rcx,%rax), %xmm0, %xmm0 > addq$32, %rax > cmpq%rdx, %rax > jne .L3 > > ;) > >> As soon as >> elemental functions will appear and we update the vectorizer so it can accept >> an elemental function inside the loop - we will have the same >> situation as we have >> it now: cost model will bail out with profitability estimation. > > Yes. > >> Still we have no chance to get info on how efficient the bar() function when >> it >> is in vector form. > > Well I assume you mean that the speedup when vectorizing the elemental > will offset whatever wreckage we cause with vectorizing the rest of the > statements. I'd say you can at least compare to unrolling by > the vectorization factor, building the vector inputs to the elemental > from scalars, distributing the vector result from the elemental to > scalars. > >> I believe I should repeat: #pragma omp simd is intended for introduction of >> an >> instruction-level parallel region on developer's request, hence should >> be treated >> in same manner as #pragma omp parallel. Vectorizer cost model is an obstacle >> here, not a help. > > Surely not if there isn't an elemental call in it. With it the > cost model of course will have not enough information to decide. 
> > But still, what's the difference to the case where we cannot vectorize > the function? What happens if we cannot vectorize the elemental? > Do we have to build scalar versions for all possible vector sizes? > > Richard. > >> Regards, >> Sergos >> >> >> On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener wrote: >> > Sergey Ostanevich wrote: >> >>this is only for the whole file? I mean to have a particula
Re: [PATCH] Fix lto bootstrap verification failure with -freorder-blocks-and-partition
On Sat, Nov 16, 2013 at 12:33 AM, Jan Hubicka wrote: >> When testing with -freorder-blocks-and-partition enabled, I hit a >> verification failure in an LTO profiledbootstrap. Edge forwarding >> performed when we went into cfg layout mode after bb reordering >> (during compgotos) created a situation where a hot block was then >> dominated by a cold block and was therefore remarked as cold. Because >> bb reorder was complete at that point, it was not moved in the >> physical layout, and we incorrectly went in and out of the cold >> section multiple times. >> >> The following patch addresses that by fixing the layout when we move >> blocks to the cold section after bb reordering is complete. >> >> Tested with an LTO profiledbootstrap with >> -freorder-blocks-and-partition enabled. Ok for trunk? >> >> Thanks, >> Teresa >> >> 2013-11-15 Teresa Johnson >> >> * cfgrtl.c (fixup_partitions): Reorder blocks if necessary. > > computed_gotos just unfactors unified blocks that we use to avoid CFGs with > O(n^2) edges. This is mostly to avoid problems with nonlinearity of other > passes > and to reduce the quadratic memory use case to one function at a time. > > I wonder if it won't be cleaner to simply unfactor those just before > pass_reorder_blocks. > > Computed gotos are used e.g. in libjava interpreter to optimize the tight > interpretting > loop. I think those cases would benefit from having at least > scheduling/reordering > and alignments done right. > > Of course it depends on how bad the compile time implications are (I think in > addition > to libjava, there was a lucier's testcase that made us to go for this trick) , > but I would prefer it over ading yet another hack into cfgrtl... > We also may just avoid cfglayout cleanup_cfg while doing computed gotos... I am testing a new patch right now that simply moves compgotos to just before bb reordering, and adds an assert to cfg_layout_initialize to detect when we attempt that after bb reordering. 
I looked at the other places that go into cfg layout and I think compgotos is currently the only one after bb reordering. From a bb layout perspective it seems like it would be beneficial to do compgotos before layout. Was the current position just to try to reduce compile time by keeping the block unified as long as possible? For your comment "I think those cases would benefit from having at least scheduling/reordering and alignments done right." do you mean that it would be better from a code quality perspective for those to have compgotos done earlier before those passes? That seems to make sense to me as well. I'm doing an lto profiledbootstrap with the change right now. Is there other testing that should be done for this change? Can you point me at lucier's testcase that you refer to above? I found that PR15242 was the bug that inspired adding compgotos, but it is a small test case so I'm not sure what I will learn from trying that other than whether compgotos still kicks in as expected. Thanks, Teresa > > Honza -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
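The invariant being restored here can be sketched independently of cfgrtl (hand-written and illustrative only, not the fixup_partitions code): walking the dominator tree top-down, any block whose immediate dominator is cold must itself be cold, since every path to it enters the cold section first — the situation the verifier flagged when edge forwarding after bb reordering created a hot block under a cold dominator.

```c
#include <assert.h>

#define NBLOCKS 5

/* idom[i] is the immediate dominator of block i (block 0 is the
   entry and its own dominator).  Blocks are numbered in a
   dominator-tree preorder, so idom[i] < i and one forward pass
   suffices to propagate coldness down the tree.  */
static void
propagate_cold (const int idom[NBLOCKS], int hot[NBLOCKS])
{
  for (int i = 1; i < NBLOCKS; i++)
    if (!hot[idom[i]])
      hot[i] = 0;	/* dominated by a cold block: remark as cold */
}
```

After such remarking, any block that flipped from hot to cold also has to be moved in the physical layout — which is exactly what cannot happen once bb reordering is complete, hence the discussion of moving compgotos earlier.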
Re: Use anti-ranges in memcpy/memset expansion
On Mon, 18 Nov 2013, Jan Hubicka wrote: > Hi, > this patch adds support for anti-ranges to determine_block_size. > This makes it possible to bound the size of the block from below, which is > useful to avoid the need for a small-size code path, and it also allows > us to inline the following: > void *a; > void *b; > t(int c) > { > if (c<10) > memcpy (a,b,c); > } > > Because c is signed, we do not really know that the block is smaller > than 10, but it is most likely the case. > > Bootstrapped/regtested x86_64-linux OK? Ok. Thanks, Richard. > Honza > > * md.texi (setmem): Document new parameter. > * optabs.c (maybe_gen_insn): Support 9 operands. > * builtins.c (determine_block_size): Add probable_max_size; > support anti-ranges. > (expand_builtin_memcpy, expand_builtin_memset_args): Pass around > probable_max_size. > * expr.c (emit_block_move_via_movmem, emit_block_move_hints, > emit_block_move, clear_storage_hints, set_storage_via_setmem): > Likewise. > * expr.h (emit_block_move_hints, clear_storage_hints, > set_storage_via_setmem): Update prototype. > * i386.md (setmem, movmem patterns): Add 9th operand. > * i386-protos.h (ix86_expand_set_or_movmem): Update prototype. > * i386.c (ix86_expand_set_or_movmem): Take probable_max_size_exp > argument; pass it to decide_alg. > > * gcc.target/i386/memcpy-3.c: New testcase. > Index: doc/md.texi > === > *** doc/md.texi (revision 204945) > --- doc/md.texi (working copy) > *** all cases. This expected alignment is al > *** 5352,5357 > --- 5352,5359 > Expected size, when unknown, is set to @code{(const_int -1)}. > Operand 7 is the minimal size of the block and operand 8 is the > maximal size of the block (NULL if it cannot be represented as CONST_INT). > + Operand 9 is the probable maximal size (i.e. we cannot rely on it for > correctness, > + but it can be used for choosing the proper code sequence for a given size). > > The use for multiple @code{setmem@var{m}} is as for @code{movmem@var{m}}. 
> > Index: optabs.c > === > *** optabs.c (revision 204945) > --- optabs.c (working copy) > *** maybe_gen_insn (enum insn_code icode, un > *** 8229,8234 > --- 8229,8238 > return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value, > ops[3].value, ops[4].value, ops[5].value, > ops[6].value, ops[7].value); > + case 9: > + return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value, > + ops[3].value, ops[4].value, ops[5].value, > + ops[6].value, ops[7].value, ops[8].value); > } > gcc_unreachable (); > } > Index: builtins.c > === > *** builtins.c(revision 204945) > --- builtins.c(working copy) > *** builtin_memcpy_read_str (void *data, HOS > *** 3096,3107 > } > > /* LEN specify length of the block of memcpy/memset operation. > !Figure out its range and put it into MIN_SIZE/MAX_SIZE. */ > > static void > determine_block_size (tree len, rtx len_rtx, > unsigned HOST_WIDE_INT *min_size, > ! unsigned HOST_WIDE_INT *max_size) > { > if (CONST_INT_P (len_rtx)) > { > --- 3096,3110 > } > > /* LEN specify length of the block of memcpy/memset operation. > !Figure out its range and put it into MIN_SIZE/MAX_SIZE. > !In some cases we can make very likely guess on max size, then we > !set it into PROBABLE_MAX_SIZE. */ > > static void > determine_block_size (tree len, rtx len_rtx, > unsigned HOST_WIDE_INT *min_size, > ! unsigned HOST_WIDE_INT *max_size, > ! unsigned HOST_WIDE_INT *probable_max_size) > { > if (CONST_INT_P (len_rtx)) > { > *** determine_block_size (tree len, rtx len_ > *** 3111,3138 > else > { > double_int min, max; > ! if (TREE_CODE (len) == SSA_NAME > ! && get_range_info (len, &min, &max) == VR_RANGE) > { > ! if (min.fits_uhwi ()) > *min_size = min.to_uhwi (); > ! else > ! *min_size = 0; > ! if (max.fits_uhwi ()) > ! *max_size = max.to_uhwi (); > ! else > ! *max_size = (HOST_WIDE_INT)-1; > } > ! else > { > ! if (host_integerp (TYPE_MIN_VALUE (TREE_TYPE (len)), 1)) > ! *min_size = tree_low_cst (TYPE_MIN_VALUE (TREE_TYPE (len)), 1); > ! else > ! 
*min_size = 0; > ! if (host_integerp (TYPE_MAX_VALUE (TREE_TYPE (len)), 1)) > ! *max_size = tree_low_cst (TYPE_MAX_VALUE (TREE_TYPE (len)), 1); > ! else > ! *max_siz
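The guaranteed-versus-probable distinction Honza's patch introduces can be sketched outside of GCC. The types and names below are illustrative, not GCC's internal API: the point is that a signed test like `c < 10` cannot bound the unsigned block size for correctness (a negative `c` becomes huge after conversion), but it still yields a good *probable* maximum, because negative lengths are almost certainly bugs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for determine_block_size: given the signed range
// [lo, hi] known for a length argument, compute the guaranteed min/max
// block size plus a "probable" max usable only for code selection.
struct BlockSize {
  uint64_t min_size;          // guaranteed lower bound
  uint64_t max_size;          // guaranteed upper bound (correctness)
  uint64_t probable_max_size; // likely upper bound (code selection only)
};

BlockSize block_size_from_signed_range(int64_t lo, int64_t hi) {
  BlockSize b;
  if (lo >= 0) {
    // Range is entirely non-negative: both bounds are exact.
    b.min_size = (uint64_t)lo;
    b.max_size = b.probable_max_size = (uint64_t)hi;
  } else {
    // Negative lengths are possible, so after conversion to an unsigned
    // size the true max is unbounded -- but a negative length is almost
    // certainly a bug, so hi still makes a good *probable* max.
    b.min_size = 0;
    b.max_size = std::numeric_limits<uint64_t>::max();
    b.probable_max_size = hi >= 0 ? (uint64_t)hi : b.max_size;
  }
  return b;
}
```

For the `if (c<10) memcpy (a,b,c);` example above, the range of `c` at the call is [INT_MIN, 9], so the guaranteed max is unbounded while the probable max is 9 -- enough to pick a short-copy expansion sequence without affecting correctness.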
Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
On Mon, 18 Nov 2013, Sergey Ostanevich wrote: > I would agree that the example is just for the case cost model makes > correct estimation But how can we assure ourself that it won't have any > mistakes in the future? We call it bugs and not mistakes and we have bugzilla for it. Richard. > I believe it'll be Ok to introduce an extra flag as Jakub proposed for the > dedicated simd-forced vectorization to use unlimited cost model. This > can be default for -fopenmp or there should be a warning issued that > compiler overrides user's request of vectorization. In such a case user > can enforce vectorization (even with mentioned results :) with this > unlimited cost model for simd. > > > > On Fri, Nov 15, 2013 at 6:24 PM, Richard Biener wrote: > > On Fri, 15 Nov 2013, Sergey Ostanevich wrote: > > > >> Richard, > >> > >> here's an example that causes trigger for the cost model. > > > > I hardly believe that (AVX2) > > > > .L9: > > vmovups (%rsi), %xmm3 > > addl$1, %r8d > > addq$256, %rsi > > vinsertf128 $0x1, -240(%rsi), %ymm3, %ymm1 > > vmovups -224(%rsi), %xmm3 > > vinsertf128 $0x1, -208(%rsi), %ymm3, %ymm3 > > vshufps $136, %ymm3, %ymm1, %ymm3 > > vperm2f128 $3, %ymm3, %ymm3, %ymm2 > > vshufps $68, %ymm2, %ymm3, %ymm1 > > vshufps $238, %ymm2, %ymm3, %ymm2 > > vmovups -192(%rsi), %xmm3 > > vinsertf128 $1, %xmm2, %ymm1, %ymm2 > > vinsertf128 $0x1, -176(%rsi), %ymm3, %ymm1 > > vmovups -160(%rsi), %xmm3 > > vinsertf128 $0x1, -144(%rsi), %ymm3, %ymm3 > > vshufps $136, %ymm3, %ymm1, %ymm3 > > vperm2f128 $3, %ymm3, %ymm3, %ymm1 > > vshufps $68, %ymm1, %ymm3, %ymm4 > > vshufps $238, %ymm1, %ymm3, %ymm1 > > vmovups -128(%rsi), %xmm3 > > vinsertf128 $1, %xmm1, %ymm4, %ymm1 > > vshufps $136, %ymm1, %ymm2, %ymm1 > > vperm2f128 $3, %ymm1, %ymm1, %ymm2 > > vshufps $68, %ymm2, %ymm1, %ymm4 > > vshufps $238, %ymm2, %ymm1, %ymm2 > > vinsertf128 $0x1, -112(%rsi), %ymm3, %ymm1 > > vmovups -96(%rsi), %xmm3 > > vinsertf128 $1, %xmm2, %ymm4, %ymm4 > > vinsertf128 $0x1, -80(%rsi), %ymm3, 
%ymm3 > > vshufps $136, %ymm3, %ymm1, %ymm3 > > vperm2f128 $3, %ymm3, %ymm3, %ymm2 > > vshufps $68, %ymm2, %ymm3, %ymm1 > > vshufps $238, %ymm2, %ymm3, %ymm2 > > vmovups -64(%rsi), %xmm3 > > vinsertf128 $1, %xmm2, %ymm1, %ymm2 > > vinsertf128 $0x1, -48(%rsi), %ymm3, %ymm1 > > vmovups -32(%rsi), %xmm3 > > vinsertf128 $0x1, -16(%rsi), %ymm3, %ymm3 > > cmpl%r8d, %edi > > vshufps $136, %ymm3, %ymm1, %ymm3 > > vperm2f128 $3, %ymm3, %ymm3, %ymm1 > > vshufps $68, %ymm1, %ymm3, %ymm5 > > vshufps $238, %ymm1, %ymm3, %ymm1 > > vinsertf128 $1, %xmm1, %ymm5, %ymm1 > > vshufps $136, %ymm1, %ymm2, %ymm1 > > vperm2f128 $3, %ymm1, %ymm1, %ymm2 > > vshufps $68, %ymm2, %ymm1, %ymm3 > > vshufps $238, %ymm2, %ymm1, %ymm2 > > vinsertf128 $1, %xmm2, %ymm3, %ymm1 > > vshufps $136, %ymm1, %ymm4, %ymm1 > > vperm2f128 $3, %ymm1, %ymm1, %ymm2 > > vshufps $68, %ymm2, %ymm1, %ymm3 > > vshufps $238, %ymm2, %ymm1, %ymm2 > > vinsertf128 $1, %xmm2, %ymm3, %ymm2 > > vaddps %ymm2, %ymm0, %ymm0 > > ja .L9 > > > > is more efficient than > > > > .L3: > > vaddss (%rcx,%rax), %xmm0, %xmm0 > > addq$32, %rax > > cmpq%rdx, %rax > > jne .L3 > > > > ;) > > > >> As soon as > >> elemental functions will appear and we update the vectorizer so it can > >> accept > >> an elemental function inside the loop - we will have the same > >> situation as we have > >> it now: cost model will bail out with profitability estimation. > > > > Yes. > > > >> Still we have no chance to get info on how efficient the bar() function > >> when it > >> is in vector form. > > > > Well I assume you mean that the speedup when vectorizing the elemental > > will offset whatever wreckage we cause with vectorizing the rest of the > > statements. I'd say you can at least compare to unrolling by > > the vectorization factor, building the vector inputs to the elemental > > from scalars, distributing the vector result from the elemental to > > scalars. 
> > > >> I believe I should repeat: #pragma omp simd is intended for introduction > >> of an > >> instruction-level parallel region on developer's request, hence should > >> be treated > >> in same manner as #pragma omp parallel. Vectorizer cost model is an > >> obstacle > >> here, not a help. > > > > Surely not if there isn't an elemental call in it. With it the > > cost model of course will have not enough information to decide. > > > > But still, what's the difference to
Minor cgraph cleanups
This fixes some minor ugliness I noticed while looking at this code. It renames a poorly named global variable (shadowed by some function args in the same file), replaces a cast of an integer value to a pointer, and removes an undocumented and unused return value. Bootstrapped and tested on x86_64-linux, ok? Bernd * cgraphunit.c (symtab_terminator): New variable. (queued_nodes): Renamed from first. Use symtab_terminator as initializer. (analyze_functions): Adjust accordingly. (cgraph_process_new_functions): Return void. * cgraph.h (cgraph_process_new_functions): Adjust declaration. diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 1ac6dfb..0d59a7a 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -737,7 +737,7 @@ void cgraph_finalize_function (tree, bool); void finalize_compilation_unit (void); void compile (void); void init_cgraph (void); -bool cgraph_process_new_functions (void); +void cgraph_process_new_functions (void); void cgraph_process_same_body_aliases (void); void fixup_same_cpp_alias_visibility (symtab_node *, symtab_node *target, tree); /* Initialize datastructures so DECL is a function in lowered gimple form. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 8ab274b..50bc7d3 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -264,11 +264,13 @@ decide_is_symbol_needed (symtab_node *node) return false; } -/* Head of the queue of nodes to be processed while building callgraph */ +/* Head and terminator of the queue of nodes to be processed while building + callgraph. */ -static symtab_node *first = (symtab_node *)(void *)1; +static symtab_node symtab_terminator; +static symtab_node *queued_nodes = &symtab_terminator; -/* Add NODE to queue starting at FIRST. +/* Add NODE to queue starting at QUEUED_NODES. The queue is linked via AUX pointers and terminated by pointer to 1. 
*/ static void @@ -276,25 +278,24 @@ enqueue_node (symtab_node *node) { if (node->aux) return; - gcc_checking_assert (first); - node->aux = first; - first = node; + gcc_checking_assert (queued_nodes); + node->aux = queued_nodes; + queued_nodes = node; } /* Process CGRAPH_NEW_FUNCTIONS and perform actions necessary to add these functions into callgraph in a way so they look like ordinary reachable functions inserted into callgraph already at construction time. */ -bool +void cgraph_process_new_functions (void) { - bool output = false; tree fndecl; struct cgraph_node *node; cgraph_node_set_iterator csi; if (!cgraph_new_nodes) -return false; +return; handle_alias_pairs (); /* Note that this queue may grow as its being processed, as the new functions may generate new ones. */ @@ -309,7 +310,6 @@ cgraph_process_new_functions (void) it into reachable functions list. */ cgraph_finalize_function (fndecl, false); - output = true; cgraph_call_function_insertion_hooks (node); enqueue_node (node); break; @@ -350,7 +350,6 @@ cgraph_process_new_functions (void) } free_cgraph_node_set (cgraph_new_nodes); cgraph_new_nodes = NULL; - return output; } /* As an GCC extension we allow redefinition of the function. The @@ -980,11 +979,11 @@ analyze_functions (void) /* Lower representation, build callgraph edges and references for all trivially needed symbols and all symbols referred by them. */ - while (first != (symtab_node *)(void *)1) + while (queued_nodes != &symtab_terminator) { changed = true; - node = first; - first = (symtab_node *)first->aux; + node = queued_nodes; + queued_nodes = (symtab_node *)queued_nodes->aux; cgraph_node *cnode = dyn_cast (node); if (cnode && cnode->definition) {
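The pattern in Bernd's patch -- an intrusive LIFO queue threaded through each node's `aux` field, terminated by the address of a dedicated sentinel object instead of a cast of the integer 1 -- can be sketched with illustrative types (not GCC's):

```cpp
#include <cassert>
#include <cstddef>

// Illustrative stand-in for the symtab_node queue: each node carries an
// `aux` link, and the queue is terminated by the address of a static
// sentinel object rather than by (symtab_node *)(void *)1.
struct node {
  int id = 0;
  node *aux = nullptr;  // link; nullptr means "not enqueued"
};

static node terminator;             // plays the role of symtab_terminator
static node *queued = &terminator;  // plays the role of queued_nodes

static void enqueue(node *n) {
  if (n->aux)       // already queued; a set aux doubles as the marker
    return;
  n->aux = queued;  // push on the front: LIFO order
  queued = n;
}

static node *dequeue() {
  if (queued == &terminator)
    return nullptr;  // queue empty
  node *n = queued;
  queued = n->aux;
  return n;
}
```

The sentinel buys two things over the `(void *)1` trick: no implementation-defined integer-to-pointer conversion, and `aux` always holds a valid pointer, which keeps checking tools and debuggers happy.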
[C++ testcase, committed] PR 53473
Hi, tested x86_64-linux, committed to mainline. Paolo. / 2013-11-18 Paolo Carlini PR c++/53473 * g++.dg/cpp0x/constexpr-noexcept7.C: New. Index: g++.dg/cpp0x/constexpr-noexcept7.C === --- g++.dg/cpp0x/constexpr-noexcept7.C (revision 0) +++ g++.dg/cpp0x/constexpr-noexcept7.C (working copy) @@ -0,0 +1,9 @@ +// PR c++/53473 +// { dg-do compile { target c++11 } } + +template struct A +{ + static constexpr T foo() noexcept { return 0; } +}; + +template<> constexpr int A::foo() noexcept { return 0; }
Re: [PATCH] Fix lto bootstrap verification failure with -freorder-blocks-and-partition
On Mon, Nov 18, 2013 at 3:53 PM, Teresa Johnson wrote: > From a bb layout perspective it seems like it would be beneficial to > do compgotos before layout. Was the current position just to try to > reduce compile time by keeping the block unified as long as possible? It was more a hack that got out of hand. Apparently it hurt some interpreters (branch prediction!) when the unified computed goto is not "unfactored". There was a PR for this, and the unfactoring code I added only fixed part of the problem. > For your comment "I think those cases would benefit from having at > least scheduling/reordering and alignments done right." do you mean > that it would be better from a code quality perspective for those to > have compgotos done earlier before those passes? That seems to make > sense to me as well. Theoretically: Yes, perhaps. In practice there isn't much to gain. Unfactoring before bb-reorder is probably the most helpful thing, to get better results for basic block alignment and placement. But scheduling punts on computed gotos (or explodes in some non-linear bottleneck). What used to help is profile-based branch speculation, i.e. if (*target_addr == most_frequent_target_addr) goto most_frequent_target_addr; else goto *target_addr; But I'm not sure if our value profiling based optimizations still have this case. > I'm doing an lto profiledbootstrap with the change right now. Is there > other testing that should be done for this change? Can you point me at > Lucier's testcase that you refer to above? I found that PR15242 was > the bug that inspired adding compgotos, but it is a small test case so > I'm not sure what I will learn from trying that other than whether > compgotos still kicks in as expected. ISTR it's http://gcc.gnu.org/PR26854 Ciao! Steven
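The profile-based branch speculation Steven describes can be sketched portably with an indirect call in place of the computed goto: if value profiling says one target dominates, test for it explicitly and branch there directly, leaving the indirect jump for the cold path. All names below are illustrative.

```cpp
#include <cassert>

typedef int (*handler_fn)(int);

static int add_one(int x) { return x + 1; }
static int negate(int x)  { return -x; }

// In a real implementation this would be filled in from value-profile
// data recorded for the indirect branch.
static handler_fn most_frequent_target = add_one;

static int dispatch(handler_fn target, int x) {
  if (target == most_frequent_target)
    return add_one(x);  // speculated direct call: predictable, inlinable
  return target(x);     // cold fallback: genuine indirect call
}
```

The win is that the hot path becomes a compare plus a statically predictable direct branch (and a candidate for inlining), while only the rare targets pay the cost of an indirect jump.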
[PATCH, i386, MPX, 2/X] Pointers Checker [23/25] Target hooks
Hi, Here is a patch with i386 version of Pointer Bounds Checker hooks. Thanks, Ilya -- 2013-11-15 Ilya Enkovich * config/i386/i386.c (ix86_builtin_mpx_function): New. (ix86_load_bounds): New. (ix86_store_bounds): New. (ix86_fn_abi_va_list_bounds_size): New. (ix86_mpx_bound_mode): New. (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 6ddd37a..50355f1 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -34044,6 +34044,162 @@ addcarryx: gcc_unreachable (); } +/* Return function decl for target specific builtin + for given MPX builtin passed i FCODE. */ +static tree +ix86_builtin_mpx_function (unsigned fcode) +{ + switch (fcode) +{ +case BUILT_IN_CHKP_BNDMK: + return ix86_builtins[IX86_BUILTIN_BNDMK]; + +case BUILT_IN_CHKP_BNDSTX: + return ix86_builtins[IX86_BUILTIN_BNDSTX]; + +case BUILT_IN_CHKP_BNDLDX: + return ix86_builtins[IX86_BUILTIN_BNDLDX]; + +case BUILT_IN_CHKP_BNDCL: + return ix86_builtins[IX86_BUILTIN_BNDCL]; + +case BUILT_IN_CHKP_BNDCU: + return ix86_builtins[IX86_BUILTIN_BNDCU]; + +case BUILT_IN_CHKP_BNDRET: + return ix86_builtins[IX86_BUILTIN_BNDRET]; + +case BUILT_IN_CHKP_INTERSECT: + return ix86_builtins[IX86_BUILTIN_BNDINT]; + +case BUILT_IN_CHKP_SET_PTR_BOUNDS: + return ix86_builtins[IX86_BUILTIN_BNDSET]; + +case BUILT_IN_CHKP_NARROW: + return ix86_builtins[IX86_BUILTIN_BNDNARROW]; + +case BUILT_IN_CHKP_ARG_BND: + return ix86_builtins[IX86_BUILTIN_ARG_BND]; + +case BUILT_IN_CHKP_SIZEOF: + return ix86_builtins[IX86_BUILTIN_SIZEOF]; + +case BUILT_IN_CHKP_EXTRACT_LOWER: + return ix86_builtins[IX86_BUILTIN_BNDLOWER]; + +case BUILT_IN_CHKP_EXTRACT_UPPER: + return ix86_builtins[IX86_BUILTIN_BNDUPPER]; + +default: + return NULL_TREE; +} + + gcc_unreachable (); +} + +/* Load bounds PTR pointer value loaded from SLOT. 
+ if SLOT is a register then load bounds associated + with special address identified by BND. + + Return loaded bounds. */ +static rtx +ix86_load_bounds (rtx slot, rtx ptr, rtx bnd) +{ + rtx addr = NULL; + rtx reg; + + if (REG_P (slot)) +{ + ptr = slot; + + /* We do not expect non register bounds for register +parameters other than R8 and R9. */ + gcc_assert (REGNO (ptr) == R8_REG || REGNO (ptr) == R9_REG); + gcc_assert (bnd == const1_rtx || bnd == const2_rtx); + + /* Here we have the case when more than five pointers are +passed on registers. In this case we are out of bound +registers and have to use bndldx to load bound. RA and +RA - 8 are used for address translation in bndldx. */ + if (bnd == const1_rtx) + addr = plus_constant (Pmode, arg_pointer_rtx, -8); + else + addr = plus_constant (Pmode, arg_pointer_rtx, -16); +} + else if (MEM_P (slot)) +{ + if (!ptr) + ptr = copy_to_mode_reg (Pmode, slot); + addr = XEXP (slot, 0); + addr = force_reg (Pmode, addr); +} + else +gcc_unreachable (); + + ptr = force_reg (Pmode, ptr); + /* If ptr was a register originally then it may have + mode other than Pmode. We need to extend in such + case because bndldx may work only with Pmode regs. */ + if (GET_MODE (ptr) != Pmode) +{ + rtx ext = gen_rtx_ZERO_EXTEND (Pmode, ptr); + ptr = gen_reg_rtx (Pmode); + emit_move_insn (ptr, ext); +} + + reg = gen_reg_rtx (BNDmode); + emit_insn (TARGET_64BIT +? gen_bnd64_ldx (reg, addr, ptr) +: gen_bnd32_ldx (reg, addr, ptr)); + + return reg; +} + +/* Store bounds BOUNDS for PTR pointer value stored in + specified ADDR. If ADDR is a register then TO identifies + which special address to use for bounds store. */ +static void +ix86_store_bounds (rtx ptr, rtx addr, rtx bounds, rtx to) +{ + if (REG_P (addr)) +{ + /* Non register bounds comes only for parameters in +R8 and R9. 
*/ + gcc_assert (REGNO (addr) == R8_REG || REGNO (addr) == R9_REG); + gcc_assert (to == const1_rtx || to == const2_rtx); + + if (to == const1_rtx) + addr = plus_constant (Pmode, stack_pointer_rtx, -8); + else + addr = plus_constant (Pmode, stack_pointer_rtx, -16); +} + else if (MEM_P (addr)) +addr = XEXP (addr, 0); + else +gcc_unreachable (); + + /* Should we also ignore integer modes of incorrect size?. */ + ptr = force_reg (Pmode, ptr); + addr = force_reg (Pmode, addr); + + /* Avoid registers which connot be used as index. */ + if (REGNO (ptr) == VIRTUAL_INCOMING_ARGS_REGNUM + || REGNO (ptr) == VIRTUAL_STACK_VARS_REGNUM + || REGNO (ptr) == VIRTUAL_OUTGOING_ARGS_REGNUM) +{
Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk
The testcase c-c++-common/cilk-plus/PS/body.c needs an effective target check for fopenmp before adding -fopenmp to its options, otherwise it'll fail on bare-metal targets like arm-none-eabi. This patch adds that. Ok to commit? Thanks, Kyrill 2013-11-18 Kyrylo Tkachov * c-c++-common/cilk-plus/PS/body.c: Add fopenmp effective target check. You can commit this as obvious. Thanks.
Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer
(resending in plain text) On Mon, Nov 18, 2013 at 7:49 PM, Alexey Samsonov wrote: Hi Jakub, Unfortunately, recently there were a few enhancements to sanitizer_symbolizer_posix_libcdep.cc (and friends), in LLVM trunk, and now it looks different from the gcc version (apparently, the changes were committed after the merge to gcc happened, I should have pointed this out earlier, sorry). Kostya (or Jakub), is it possible to somehow pick up the changes? Otherwise this patch can't go in ASan runtime in gcc - the code will significantly diverge. > On Mon, Nov 18, 2013 at 5:39 PM, Jakub Jelinek wrote: >> >> Hi! >> >> This patch adds a libbacktrace symbolizer to libsanitizer, with which we >> can avoid spawning and running an external program (especially >> when >> it is not even provided by gcc itself). >> >> I've kept the possibility to override the symbolizer by magic symbols >> (InternalSymbolizer), and as I have no idea how the llvm buildsystem etc. >> works and what is the possibility there to add libbacktrace, this just >> requires users to compile with SANITIZE_LIBBACKTRACE defined to signal >> that backtrace-supported.h and backtrace.h is available and the >> sanitizer_symbolizer_posix_libcdep.cc source then decides based on >> backtrace-supported.h etc. whether it is usable. >> >> make check RUNTESTFLAGS='asan.exp ubsan.exp' passes with this. >> >> Some pending issues on the libbacktrace side: >> 1) right now libbacktrace performs dl_iterate_phdr only the first time >> backtrace_pcinfo or backtrace_syminfo is called, if there are some >> dlopens/dlcloses in between that and another querying of the >> symbolizer, >> it won't notice that. 
Perhaps it can be done only when we don't >> find a symbol and/or have some function that tries to dl_iterate_phdr >> again, look at cached st_ino/st_mtime or similar, and for threaded >> version likely just keep old records, just add a flag to them that they >> should be ignored (or say atomically decrease symbol count to zero >> and something similar for debug info). >> 2) for tsan querying of data symbols, apparently the classes want to see >> not just the symbol name and start value, but also size. libbacktrace >> has all this info available, just doesn't pass it down to the callback. >> I wonder if we'd need to create yet another libbacktrace entrypoint, or >> if it would be acceptable to do source code incompatible, ABI (at least >> on all sane targets) compatible version of just adding another >> uintptr_t symsize argument to backtrace_syminfo_callback. >> 3) I wonder if libbacktrace couldn't be updated to use __atomic_* >> builtins, >> then it could avoid the ugliness to emulate atomic loads and stores. >> >> As for sanitizer, the reason I haven't implemented SendCommand method for >> the libbacktrace symbolizer is that the library doesn't provide the info >> as text, but as individual values passed to the callback, so printing >> that to text and then parsing the text would be very ugly. libbacktrace >> doesn't even need the module names and module offsets, > > If libbacktrace does its own call of dl_iterate_phdr, then it doesn't need many pieces of existing Symbolizer (or POSIXSymbolizer, in new version of source code) complexity - like FindModuleForAddress() function. Consider creating a separate wrapper class for libbacktrace functionality and returning it from Symbolizer::PlatformInit factory (in the new version of code) instead of POSIXSymbolizer.
>> and module name/offset fields, apparently the latter are unused if the >> former is filled in. >> >> 2013-11-18 Jakub Jelinek >> >> PR sanitizer/59136 >> * configure.ac: Don't add target-libbacktrace to noconfigdirs >> just because go hasn't been enabled if target-libsanitizer isn't >> in >> noconfigdirs. >> * Makefile.def: Add configure-target-libsanitizer dependency on >> configure-target-libbacktrace and all-target-libsanitizer >> dependency >> on configure-target-libsanitizer. >> * configure: Regenerated. >> libsanitizer/ >> * sanitizer_common/Makefile.am (AM_CPPFLAGS): If >> SANITIZER_LIBBACKTRACE, append -I for libbacktrace headers and >> -DSANITIZER_LIBBACKTRACE. >> * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc: Add >> libbacktrace symbolizer. >> * tsan/Makefile.am (libtsan_la_LIBADD): Add libbacktrace.la if >> SANITIZER_LIBBACKTRACE. >> * asan/Makefile.am (libasan_la_LIBADD): Likewise. >> * ubsan/Makefile.am (libubsan_la_LIBADD): Likewise. >> * configure.ac (SANITIZER_LIBBACKTRACE): New AM_CONDITIONAL. >> * sanitizer_common/Makefile.in: Regenerated. >> * tsan/Makefile.in: Regenrated. >> * a
Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p
On 18/11/13 09:14, Kyrill Tkachov wrote: On 18/11/13 08:37, Yvan Roux wrote: So, the validation is ok with this patch, I'm just not able to say if the original performance issue is still fixed with it. Could you check it Kyrylo ? Hi Yvan, I'll run the benchmark today to confirm the performance, yep, all good performance-wise :) Thanks, Kyrill but from compiling some code sequences that exhibited the bad behaviour in the past, I see that this patch still fixes the issues. store_minmaxsi is not generated when optimising for speed. Thanks, Kyrill
Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer
On Mon, Nov 18, 2013 at 07:49:03PM +0400, Alexey Samsonov wrote: > > As for sanitizer, the reason I haven't implemented SendCommand method for > > the libbacktrace symbolizer is that the library doesn't provide the info > > as text, but as individual values passed to the callback, so printing > > that to text and then parsing the text would be very ugly. libbacktrace > > doesn't even need the module names and module offsets, > > > If libbacktrace does its own call of dl_iterate_phdr, then it doesn't need > many It does. > pieces of existing Symbolizer (or POSIXSymbolizer, in new version of source > code) > complexity - like FindModuleForAddress() function. Consider creating a > separate Well, as fallback it still needs that. libbacktrace APIs won't tell you: "ok, see, I don't know exactly what this symbol is, but it is in DSO libfoobar.so plus offset 0x123456". It tells you either function name and file/line (for backtrace_pcinfo callback, possibly several times to iterate through inline frames) or symbol name and symbol value (hopefully in the future also symbol size) for backtrace_syminfo callback. If the address isn't found in debug info resp. symbol tables, you just get NULL function name of symbol value, NULL filename and 0 line number or symbol value. Initially I had the code as a different toplevel symbolizer, but for the case where libbacktrace isn't there, or where it isn't usable (due to host limitations e.g. not thread safe, or using malloc, etc.) there needs to be a fallback to the POSIXSymbolizer. Jakub
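The selection scheme discussed in this thread -- a weak-symbol internal symbolizer overrides everything, then the in-process libbacktrace symbolizer if it is available and usable, then the external helper program as a last resort -- is a simple first-available chain. Schematically (all names invented for illustration):

```cpp
#include <cassert>

// Schematic of the symbolizer fallback chain: at most one backend is
// selected, in the priority order described above.
enum Backend { NONE, INTERNAL, LIBBACKTRACE, EXTERNAL };

static Backend pick_symbolizer(bool have_internal, bool have_libbacktrace,
                               bool have_external) {
  if (have_internal)
    return INTERNAL;      // magic weak symbols override everything
  if (have_libbacktrace)
    return LIBBACKTRACE;  // in-process; no helper program to spawn
  if (have_external)
    return EXTERNAL;      // spawn the external symbolizer as before
  return NONE;            // print raw addresses only
}
```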
Re: [PATCH] S/390: More htm testcases plus some fixes
On Mon, 2013-11-18 at 10:05 +0100, Andreas Krebbel wrote: > With the patch the htm-nofloat-2 testcase fails. Due to the > "returns_twice" flag on tbegin the optimizers fail to fold the > compares of the condition code and the s390_optimize_nonescaping_tx > routine in turn fails to optimize the simple transactions. This will > hopefully be fixed with a follow-on patch. Hi Andreas, I assume you're using the returns_twice attribute on your tbegin builtin so that the compiler will help you with the handling of the floating point registers since they are not restored on the s390's transaction failure? We don't have that attribute set on POWER's tbegin builtin and I don't think we should since all of our registers are restored on a transaction failure, but I'd like to know if you added that attribute for any other reason such that POWER should have it too? Peter
Re: [PATCH] Implement -fsanitize=null + new sanopt pass
On Mon, Nov 18, 2013 at 02:52:34PM +0100, Jakub Jelinek wrote: > Oh, one more thing, please update gcc/doc/, the -fsanitize= description is > far from up to date there. Ok, the following (incremental) hopefully improves the docs. Joseph, would you mind having a look at this? Thanks, 2013-11-18 Marek Polacek * doc/invoke.texi: Extend -fsanitize=undefined documentation. --- gcc/doc/invoke.texi.mp3 2013-11-18 15:57:47.104103101 +0100 +++ gcc/doc/invoke.texi 2013-11-18 17:08:51.305594441 +0100 @@ -5260,9 +5260,45 @@ data race bugs. See @uref{http://code.google.com/p/data-race-test/wiki/ThreadSanitizer} for more details. @item -fsanitize=undefined -Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector +Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector. Various computations will be instrumented to detect undefined behavior -at runtime, e.g.@: division by zero or various overflows. +at runtime. Current suboptions are: + +@itemize @bullet + +@item @option{-fsanitize=shift} + +This option enables checking that the result of a shift operation is +not undefined. Note that what exactly is considered undefined differs +slightly between C and C++, as well as between ANSI C and C99, etc. + +@item @option{-fsanitize=integer-divide-by-zero} + +Detect integer division by zero as well as @code{INT_MIN / -1} division. +Note that the latter is only made undefined from C99 onwards. + +@item @option{-fsanitize=unreachable} + +With this option, the compiler will turn the @code{__builtin_unreachable} +call into a diagnostics message call instead. When reaching the +@code{__builtin_unreachable} call, the behavior is undefined. + +@item @option{-fsanitize=vla-bound} + +This option instructs the compiler to check that the size of a variable +length array is positive. This option does not have any effect in +@option{-std=c++1y} mode, as the standard requires the exception be thrown +instead. 
+ +@item @option{-fsanitize=null} + +This option enables pointer checking. Particularly, the application +built with this option turned on will issue an error message when it +tries to dereference a NULL pointer, or if a reference (possibly an +rvalue reference) is bound to a NULL pointer. + +@end itemize + While @option{-ftrapv} causes traps for signed overflows to be emitted, @option{-fsanitize=undefined} gives a diagnostic message. This currently works only for the C family of languages. Marek
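What `-fsanitize=shift` instruments can be written out as an explicit predicate. The function below is illustrative only, not the ubsan runtime; it states the C99 conditions for a well-defined left shift of `int` operands (as the documentation notes, C++ differs slightly in the overflow case):

```cpp
#include <cassert>
#include <climits>

// Rough predicate for "x << s is well defined" for int operands under
// C99 rules -- essentially the conditions -fsanitize=shift checks at
// run time: shift count in range, non-negative left operand, and a
// representable result.
static bool shift_ok(int x, int s) {
  const int width = sizeof (int) * CHAR_BIT;  // 32 on common targets
  if (s < 0 || s >= width)
    return false;                // shift count out of range
  if (x < 0)
    return false;                // left-shifting a negative value
  return x <= (INT_MAX >> s);    // result must be representable
}
```

For 32-bit `int` this rejects, for example, `1 << 31` (not representable) and `5 << 32` (count out of range), exactly the operations the sanitizer would diagnose at run time.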
[WWWDOCS] Document IPA/LTO/FDO/i386 changes in GCC-4.9
Hi, there were many changes in this area. The following are the ones I can think of. Please feel free to suggest more changes. We probably should mention Teresa's splitting work once it is complete and the new micro-architectures targeted by the x86 backend. Honza Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v retrieving revision 1.36 diff -u -r1.36 changes.html --- changes.html 15 Nov 2013 15:40:00 - 1.36 +++ changes.html 18 Nov 2013 16:15:32 - @@ -37,14 +37,52 @@ AddressSanitizer, a fast memory error detector, is now available on ARM. - - UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior detector, has been added and can be enabled via -fsanitize=undefined. Various computations will be instrumented to detect undefined behavior at runtime. UndefinedBehaviorSanitizer is currently available for the C and C++ languages. +Link-time optimization (LTO) improvements: + + Type merging was rewritten. The new implementation is significantly faster + and uses less memory. + Better partitioning algorithm resulting in less streaming during + link-time. + Early removal of virtual methods reduces the size of object files and + improves link-time memory usage and compile time. + Functions are no longer pointlessly renamed. + Function bodies are now loaded on-demand and released early, improving + overall memory usage at link-time. + C++ hidden keyed methods can now be optimized out. + +Memory usage of Firefox build with debug enabled was reduced from 15GB to +3.5GB. Link time went from 1700 seconds to 350 seconds. + +Inter-procedural optimization improvements: + + New type inheritance analysis module improving devirtualization. + Devirtualization now takes into account anonymous name-spaces and the + C++11 final keyword. + New speculative devirtualization pass (controlled by + -fdevirtualize-speculatively). + Calls that were speculatively made direct are turned back to indirect + when doing so does not bring any noticeable benefits. 
+ Local aliases are introduced for symbols that are known to be + semantically equivalent across shared libraries, improving dynamic + linking times. + +Feedback directed optimization improvements: + + Profiling of programs using C++ inline functions is now more reliable. + New time profiling determines the typical order in which functions are executed. + New function reordering pass (controlled by + -freorder-functions) significantly reduces + startup time of large applications. Until binutils support is + completed, it is effective only with link time optimization. + Feedback driven indirect call removal and devirtualization now handle + cross-module calls when link-time optimization is enabled. + New Languages and Language specific improvements @@ -325,9 +363,20 @@ href="http://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning.html"; >Function Multiversioning. - GCC now supports the new Intel microarchitecture named Silvermont +GCC now supports the new Intel microarchitecture named Silvermont through -march=slm. +-march=generic has been retuned for better support of + Intel Core and AMD Bulldozer architectures. Performance of AMD K7, K8, + Intel Pentium-M, and Pentium4 based CPUs is no longer considered important + for generic. + +Better inlining of memcpy and memset + that is aware of value ranges and produces shorter alignment prologues. + +-mno-accumulate-outgoing-args is now honored when unwind + information is output. Argument accumulation is also now turned off + for portions of the program optimized for size. NDS32
Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer
On Mon, Nov 18, 2013 at 07:50:33PM +0400, Alexey Samsonov wrote: > On Mon, Nov 18, 2013 at 7:49 PM, Alexey Samsonov wrote: > Unfortunately, recently there were a few enhancements to > sanitizer_symbolizer_posix_libcdep.cc (and friends), > in LLVM trunk, and now it looks different from gcc version (apparently, the > changes were committed > after the merge to gcc happened, I should have pointed this out earlier, > sorry). > > Kostya (or Jakub), is it possible to somehow pick up the changes? Otherwise > this patch can't go in ASan runtime > in gcc - the code will significantly diverge. Here is an (untested) forward port of the patch to what is in llvm svn right now. The unpatched code always had either internal_symbolizer_ != NULL, or external_symbolizer_ != NULL, or both NULL, but not both set. The patch just adds a third variant, libbacktrace_symbolizer_, again, at most one of the 3 will be non-NULL, and the priorities are that internal_symbolizer_ has highest priority, then libbacktrace (if available/usable), then the external one. --- sanitizer_symbolizer_posix_libcdep.cc.jj2013-11-12 19:35:30.0 +0100 +++ sanitizer_symbolizer_posix_libcdep.cc 2013-11-18 17:16:03.202643957 +0100 @@ -27,6 +27,16 @@ #include #include +#ifdef SANITIZER_LIBBACKTRACE +#include "backtrace-supported.h" +#if SANITIZER_LINUX && BACKTRACE_SUPPORTED \ +&& !BACKTRACE_USES_MALLOC && BACKTRACE_SUPPORTS_THREADS +#include "backtrace.h" +#else +#undef SANITIZER_LIBBACKTRACE +#endif +#endif + // C++ demangling function, as required by Itanium C++ ABI. This is weak, // because we do not require a C++ ABI library to be linked to a program // using sanitizers; if it's not present, we'll just use the mangled name. 
@@ -364,12 +374,124 @@ class InternalSymbolizer { #endif // SANITIZER_SUPPORTS_WEAK_HOOKS +#if SANITIZER_LIBBACKTRACE +namespace { + +struct SymbolizeCodeData { + AddressInfo *frames; + uptr n_frames; + uptr max_frames; + const char *module_name; + uptr module_offset; +}; + +extern "C" { + +static int SymbolizeCodePCInfoCallback(void *vdata, uintptr_t addr, + const char *filename, int lineno, + const char *function) { + SymbolizeCodeData *cdata = (SymbolizeCodeData *)vdata; + if (function) { +AddressInfo *info = &cdata->frames[cdata->n_frames++]; +info->Clear(); +info->FillAddressAndModuleInfo(addr, cdata->module_name, cdata->module_offset); +info->function = internal_strdup(function); +if (filename) + info->file = internal_strdup(filename); +info->line = lineno; + } + return 0; +} + +static void SymbolizeCodeCallback(void *vdata, uintptr_t addr, + const char *symname, uintptr_t) { + SymbolizeCodeData *cdata = (SymbolizeCodeData *)vdata; + if (symname) { +AddressInfo *info = &cdata->frames[0]; +info->Clear(); +info->FillAddressAndModuleInfo(addr, cdata->module_name, cdata->module_offset); +info->function = internal_strdup(symname); +cdata->n_frames = 1; + } +} + +static void SymbolizeDataCallback(void *vdata, uintptr_t, + const char *symname, uintptr_t symval) { + DataInfo *info = (DataInfo *)vdata; + if (symname && symval) { +info->name = internal_strdup(symname); +info->start = symval; + } +} + +static void ErrorCallback(void *, const char *, int) { +} + +} + +} + +class LibbacktraceSymbolizer { + public: + static LibbacktraceSymbolizer *get(LowLevelAllocator *alloc) { +backtrace_state *state + = backtrace_create_state("/proc/self/exe", 1, ErrorCallback, NULL); +if (!state) + return 0; +return new(*alloc) LibbacktraceSymbolizer(state); + } + + uptr SymbolizeCode(uptr addr, AddressInfo *frames, uptr max_frames, +const char *module_name, uptr module_offset) { +SymbolizeCodeData data; +data.frames = frames; +data.n_frames = 0; +data.max_frames = max_frames; 
+data.module_name = module_name; +data.module_offset = module_offset; +backtrace_pcinfo(state_, addr, SymbolizeCodePCInfoCallback, ErrorCallback, +&data); +if (data.n_frames) + return data.n_frames; +backtrace_syminfo(state_, addr, SymbolizeCodeCallback, ErrorCallback, &data); +return data.n_frames; + } + + void SymbolizeData(DataInfo *info) { +backtrace_syminfo(state_, info->address, SymbolizeDataCallback, + ErrorCallback, info); + } + + private: + LibbacktraceSymbolizer(backtrace_state *state) : state_(state) { } + + backtrace_state *state_; // Leaked. +}; +#else +class LibbacktraceSymbolizer { + public: + static LibbacktraceSymbolizer *get(LowLevelAllocator *) { +return 0; + } + + uptr SymbolizeCode(uptr addr, AddressInfo *frames, uptr max_frames, +const char *module_name, uptr module_offset) { +return 0; + } + + void SymbolizeData(DataInfo *in
Re: [PATCH] Fix libbacktrace on prelinked shared libraries
On Mon, Nov 18, 2013 at 5:11 AM, Jakub Jelinek wrote: > > info->dlpi_addr == 0 is true for executables (but those have also > info->dlpi_name set to ""), but not just for those, it is true also > for successfully prelinked shared libraries or any other libraries that > have been linked at non-zero offset and the dynamic linker managed to mmap > them where they were prelinked resp. linked to. > > BTW, I'm surprised you aren't adding base_address (well, in fact it is > actually not base address, but load bias, difference between load address > and link base address) to elf_initialize_syminfo created symbol addresses, > because all of them should be biased by the load address. > And, elf_initialize_syminfo should ignore st_shndx == SHN_UNDEF symbols, > those shouldn't be biased but not even entered into the table. You're right, these are bugs. > 2013-11-18 Jakub Jelinek > > * elf.c (phdr_callback): Process info->dlpi_addr == 0 normally. This is OK. Thanks. Ian
[PATCH] Add reference binding instrumentation
This incremental patch implements reference binding instrumentation; that is, it detects cases like int *p = NULL; int &r = *p; or auto &&rr = *p;. It does so by adding a COMPOUND_EXPR to the decl. As the standard says, "A reference shall be initialized to refer to a valid object or function." Jason, is that tiny C++ part ok with you? Regtested/bootstrapped on x86_64-linux, ran bootstrap-ubsan, ok for trunk? 2013-11-18 Marek Polacek c-family/ * c-ubsan.h (ubsan_instrument_reference): Declare. * c-ubsan.c (ubsan_instrument_reference): New function. cp/ * decl.c (cp_finish_decl): Instrument reference binding. testsuite/ * g++.dg/ubsan/null-1.C: New test. --- gcc/c-family/c-ubsan.h.mp2 2013-11-18 12:52:00.572671736 +0100 +++ gcc/c-family/c-ubsan.h 2013-11-18 12:52:25.751761970 +0100 @@ -24,5 +24,6 @@ along with GCC; see the file COPYING3. extern tree ubsan_instrument_division (location_t, tree, tree); extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree); extern tree ubsan_instrument_vla (location_t, tree); +extern tree ubsan_instrument_reference (location_t, tree); #endif /* GCC_C_UBSAN_H */ --- gcc/c-family/c-ubsan.c.mp2 2013-11-18 12:49:00.553025438 +0100 +++ gcc/c-family/c-ubsan.c 2013-11-18 15:26:53.835120271 +0100 @@ -179,3 +179,30 @@ ubsan_instrument_vla (location_t loc, tr return t; } + +/* Instrument reference binding, that is, ensure that the reference + declaration doesn't bind the reference to a NULL pointer. */ + +tree +ubsan_instrument_reference (location_t loc, tree init) +{ + if (!INDIRECT_REF_P (init)) +/* This may happen, e.g. int &&r4 = p;, so don't put an assert here. 
*/ +return init; + + init = TREE_OPERAND (init, 0); + tree eq_expr = fold_build2 (EQ_EXPR, boolean_type_node, init, + build_zero_cst (TREE_TYPE (init))); + const struct ubsan_mismatch_data m += { build_zero_cst (pointer_sized_int_node), + build_int_cst (unsigned_char_type_node, UBSAN_REF_BINDING)}; + tree data = ubsan_create_data ("__ubsan_null_data", +loc, &m, +ubsan_type_descriptor (TREE_TYPE (init), + true), NULL_TREE); + data = build_fold_addr_expr_loc (loc, data); + tree fn = builtin_decl_implicit (BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH); + fn = build_call_expr_loc (loc, fn, 2, data, + build_zero_cst (pointer_sized_int_node)); + return fold_build3 (COND_EXPR, void_type_node, eq_expr, fn, void_zero_node); +} --- gcc/cp/decl.c.mp2 2013-11-15 17:14:24.887512640 +0100 +++ gcc/cp/decl.c 2013-11-18 13:38:15.356077696 +0100 @@ -6216,6 +6217,11 @@ cp_finish_decl (tree decl, tree init, bo if (decl_maybe_constant_var_p (decl)) TREE_CONSTANT (decl) = 1; } + if (flag_sanitize & SANITIZE_NULL + && TREE_CODE (type) == REFERENCE_TYPE) + init = fold_build2 (COMPOUND_EXPR, TREE_TYPE (init), + ubsan_instrument_reference (input_location, init), + init); } if (processing_template_decl) --- gcc/testsuite/g++.dg/ubsan/null-1.C.mp2 2013-11-18 15:29:44.722831910 +0100 +++ gcc/testsuite/g++.dg/ubsan/null-1.C 2013-11-18 15:29:54.744874505 +0100 @@ -0,0 +1,23 @@ +/* { dg-do run } */ +/* { dg-options "-fsanitize=null -w -std=c++11" } */ + +typedef const long int L; + +int +main (void) +{ + int *p = 0; + L *l = 0; + + int &r = *p; + auto &r2 = *p; + L &lr = *l; + + /* Try an rvalue reference. */ + auto &&r3 = *p; +} + +/* { dg-output "reference binding to null pointer of type 'int'(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*reference binding to null pointer of type 'int'(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*reference binding to null pointer of type 'const L'(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*reference binding to null pointer of type 'int'(\n|\r\n|\r)" } */ Marek
Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support
On 11/08/13 02:02, Ilya Enkovich wrote: Hi, Here is an updated patch version with no langhook. Regarding the TLS objects issue - I do not think the compiler should compensate for the absence of instrumentation in libraries. The compiler should be responsible for initialization of Bounds Tables for the .tdata section. Correct data copying is the responsibility of the library. Users should use either an instrumented library or wrapper calls if they need this functionality. Thanks, Ilya -- gcc/ 2013-11-06 Ilya Enkovich * c/c-parser.c: Include tree-chkp.h. (c_parser_declaration_or_fndef): Register statically initialized decls in Pointer Bounds Checker. * cp/decl.c: Include tree-chkp.h. (cp_finish_decl): Register statically initialized decls in Pointer Bounds Checker. * gimplify.c: Include tree-chkp.h. (gimplify_init_constructor): Register statically initialized decls in Pointer Bounds Checker. Is parsing really the right time to register these things with the checking framework? Doesn't all this stuff flow through the gimplifier? If so, wouldn't that be a better place? Doing it in the gimplifier seems good from the standpoint of simplifying the long-term maintenance of the checking code. If there's a good reason to have this in the front end, please explain it. Thanks, Jeff
Re: [PATCH] Fix libbacktrace on prelinked shared libraries
On Mon, Nov 18, 2013 at 08:29:05AM -0800, Ian Lance Taylor wrote: > On Mon, Nov 18, 2013 at 5:11 AM, Jakub Jelinek wrote: > > > > info->dlpi_addr == 0 is true for executables (but those have also > > info->dlpi_name set to ""), but not just for those, it is true also > > for successfully prelinked shared libraries or any other libraries that > > have been linked at non-zero offset and the dynamic linker managed to mmap > > them where they were prelinked resp. linked to. > > > > BTW, I'm surprised you aren't adding base_address (well, in fact it is > > actually not base address, but load bias, difference between load address > > and link base address) to elf_initialize_syminfo created symbol addresses, > > because all of them should be biased by the load address. > > And, elf_initialize_syminfo should ignore st_shndx == SHN_UNDEF symbols, > > those shouldn't be biased but not even entered into the table. > > You're right, these are bugs. So what about this fix then? Tested on a short testcase that links against a shared library without debug info and looks up the symbol using backtrace_syminfo. make check in libbacktrace also passes. 2013-11-18 Jakub Jelinek * elf.c (SHN_UNDEF): Define. (elf_initialize_syminfo): Add base_address argument. Ignore symbols with st_shndx == SHN_UNDEF. Add base_address to address fields. (elf_add): Adjust caller. --- libbacktrace/elf.c.jj 2013-11-18 12:48:07.0 +0100 +++ libbacktrace/elf.c 2013-11-18 17:41:38.903835173 +0100 @@ -98,6 +98,7 @@ dl_iterate_phdr (int (*callback) (struct #undef EV_CURRENT #undef SHN_LORESERVE #undef SHN_XINDEX +#undef SHN_UNDEF #undef SHT_SYMTAB #undef SHT_STRTAB #undef SHT_DYNSYM @@ -183,6 +184,7 @@ typedef struct { b_elf_wxword sh_entsize; /* Entry size if section holds table */ } b_elf_shdr; /* Elf_Shdr. 
*/ +#define SHN_UNDEF 0x /* Undefined section */ #define SHN_LORESERVE 0xFF00 /* Begin range of reserved indices */ #define SHN_XINDEX 0x /* Section index is held elsewhere */ @@ -342,6 +344,7 @@ elf_symbol_search (const void *vkey, con static int elf_initialize_syminfo (struct backtrace_state *state, + uintptr_t base_address, const unsigned char *symtab_data, size_t symtab_size, const unsigned char *strtab, size_t strtab_size, backtrace_error_callback error_callback, @@ -365,7 +368,8 @@ elf_initialize_syminfo (struct backtrace int info; info = sym->st_info & 0xf; - if (info == STT_FUNC || info == STT_OBJECT) + if ((info == STT_FUNC || info == STT_OBJECT) + && sym->st_shndx != SHN_UNDEF) ++elf_symbol_count; } @@ -385,6 +389,8 @@ elf_initialize_syminfo (struct backtrace info = sym->st_info & 0xf; if (info != STT_FUNC && info != STT_OBJECT) continue; + if (sym->st_shndx == SHN_UNDEF) + continue; if (sym->st_name >= strtab_size) { error_callback (data, "symbol string index out of range", 0); @@ -393,7 +399,7 @@ elf_initialize_syminfo (struct backtrace return 0; } elf_symbols[j].name = (const char *) strtab + sym->st_name; - elf_symbols[j].address = sym->st_value; + elf_symbols[j].address = sym->st_value + base_address; elf_symbols[j].size = sym->st_size; ++j; } @@ -733,7 +739,7 @@ elf_add (struct backtrace_state *state, if (sdata == NULL) goto fail; - if (!elf_initialize_syminfo (state, + if (!elf_initialize_syminfo (state, base_address, symtab_view.data, symtab_shdr->sh_size, strtab_view.data, strtab_shdr->sh_size, error_callback, data, sdata)) Jakub
Re: [PATCH] Implement -fsanitize=null + new sanopt pass
On Mon, 18 Nov 2013, Marek Polacek wrote: > +@item @option{-fsanitize=shift} > + > +This option enables checking that the result of a shift operation is > +not undefined. Note that what exactly is considered undefined differs > +slightly between C and C++, as well as between ANSI C and C99, etc. We generally refer to ISO C90, not ANSI C. > +Detect integer division by zero as well as @code{INT_MIN / -1} division. > +Note that the latter is only made undefined from C99 onwards. INT_MIN / -1 is unambiguously undefined in C90 - it's a signed arithmetic overflow (result not within the range of its type). It's INT_MIN % -1 where there's more ambiguity, but I consider the wording changes in C11 as a defect correction that should be applied back to C90. (A comment on what the semantics should be, not on whether the documentation accurately reflects the code.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer
On Mon, Nov 18, 2013 at 5:39 AM, Jakub Jelinek wrote: > > I've kept the possibility to override the symbolizer by magic symbols > (InternalSymbolizer), and as I have no idea how the llvm buildsystem etc. > works and what is the possibility there to add libbacktrace, this just > requires users to compile with SANITIZER_LIBBACKTRACE defined to signal > that backtrace-supported.h and backtrace.h is available and the > sanitizer_symbolizer_posix_libcdep.cc source then decides based on > backtrace-supported.h etc. whether it is usable. Just FYI for those who don't know, libbacktrace is under a BSD license so there should be no licensing objection to using it in LLVM. > 1) right now libbacktrace performs dl_iterate_phdr only the first time > backtrace_pcinfo or backtrace_syminfo is called, if there are some > dlopens/dlcloses in between that and another querying of the symbolizer, > it won't notice that. Perhaps it can be done only when we don't > find a symbol and/or have some function that tries to dl_iterate_phdr > again, look at cached st_ino/st_mtime or similar, and for threaded > version likely just keep old records, just add a flag to them that they > should be ignored (or say atomically decrease symbol count to zero > and something similar for debug info). Yes, this is a FIXME in dwarf_fileline in libbacktrace/dwarf.c. > 2) for tsan querying of data symbols, apparently the classes want to see > not just the symbol name and start value, but also size. libbacktrace > has all this info available, just doesn't pass it down to the callback. > I wonder if we'd need to create yet another libbacktrace entrypoint, or > if it would be acceptable to do source code incompatible, ABI (at least > on all sane targets) compatible version of just adding another > uintptr_t symsize argument to backtrace_syminfo_callback. I think it would be fine to change the callback. I doubt that libbacktrace is so widely used that we need to worry about backward compatibility at this stage. 
In particular I imagine that any users of libbacktrace are simply copying the source code, since there is no installable package. > 3) I wonder if libbacktrace couldn't be updated to use __atomic_* builtins, >then it could avoid the ugliness to emulate atomic loads and stores. I think that would be fine. I can't remember why I didn't use the __atomic builtins. Maybe they weren't quite ready at the time. Ian
Re: [PATCH] Fix libbacktrace on prelinked shared libraries
On Mon, Nov 18, 2013 at 8:46 AM, Jakub Jelinek wrote: > > 2013-11-18 Jakub Jelinek > > * elf.c (SHN_UNDEF): Define. > (elf_initialize_syminfo): Add base_address argument. Ignore symbols > with st_shndx == SHN_UNDEF. Add base_address to address fields. > (elf_add): Adjust caller. This is OK. Thanks. Ian
Re: Add value range support into memcpy/memset expansion
On Mon, Nov 18, 2013 at 6:19 AM, H.J. Lu wrote: > On Sun, Nov 17, 2013 at 3:38 PM, Jan Hubicka wrote: >> Hi, >> this is version I comitted. It also adds a testcase and enables the support >> in i386 backend. >> >> Honza >> >> * doc/md.texi (setmem, movstr): Update documentation. >> * builtins.c (determine_block_size): New function. >> (expand_builtin_memcpy): Use it and pass it to >> emit_block_move_hints. >> (expand_builtin_memset_args): Use it and pass it to >> set_storage_via_setmem. >> * expr.c (emit_block_move_via_movmem): Add min_size/max_size >> parameters; >> update call to expander. >> (emit_block_move_hints): Add min_size/max_size parameters. >> (clear_storage_hints): Likewise. >> (set_storage_via_setmem): Likewise. >> (clear_storage): Update. >> * expr.h (emit_block_move_hints, clear_storage_hints, >> set_storage_via_setmem): Update prototype. >> * i386.c (ix86_expand_set_or_movmem): Add bounds; export. >> (ix86_expand_movmem, ix86_expand_setmem): Remove. >> (ix86_expand_movmem, ix86_expand_setmem): Remove. >> * i386.md (movmem, setmem): Pass parameters. >> >> * testsuite/gcc.target/i386/memcpy-2.c: New testcase. 
> > The new testcase fails for me on x86 and x86-64: > > FAIL: gcc.target/i386/memcpy-2.c scan-assembler-not memcpy > FAIL: gcc.target/i386/memcpy-2.c (test for excess errors) > I got [hjl@gnu-6 gcc]$ /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/memcpy-2.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -ffat-lto-objects -ffat-lto-objects -S -m32 -o memcpy-2.s /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/memcpy-2.c: In function ‘t’: /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/memcpy-2.c:10:5: warning: incompatible implicit declaration of built-in function ‘memcpy’ [enabled by default] [hjl@gnu-6 gcc]$ cat memcpy-2.s .file"memcpy-2.c" .text .p2align 4,,15 .globlt .typet, @function t: .LFB0: .cfi_startproc subl$28, %esp .cfi_def_cfa_offset 32 movl32(%esp), %eax cmpl$9, %eax ja.L3 movl%eax, 8(%esp) movlb, %eax movl%eax, 4(%esp) movla, %eax movl%eax, (%esp) callmemcpy .L3: addl$28, %esp .cfi_def_cfa_offset 4 ret .cfi_endproc /* { dg-final { scan-assembler-not "memcpy" } } */ will also match .file"memcpy-2.c" -- H.J.