Re: [PATCH] Fix static libasan link

2013-11-18 Thread Yury Gribov

> Ok with those changes.
>Jakub

Thanks. Committed revision 204934.

-Y


Re: Revert libsanitizer patches or fix 59009

2013-11-18 Thread Kostya Serebryany
On Sat, Nov 16, 2013 at 7:59 AM, John David Anglin  wrote:
> As far as I can tell, libsanitizer works on hppa-linux.  So, the change
> could be added to the llvm tree.
> However, I'm unlikely to test anything in the tree unless someone tells me
> there's something to test.

Submitted the hppa patch as
http://llvm.org/viewvc/llvm-project?view=revision&revision=194995.

>
> Dave
>
>
> On 15-Nov-13, at 10:52 PM, Konstantin Serebryany wrote:
>
>> Dave,
>>
>> Do you want the asan/asan_linux.cc (# elif defined(__hppa__)) part to
>> be in the llvm tree?
>>
>> --kcc
>>
>> On Sat, Nov 16, 2013 at 3:55 AM, John David Anglin 
>> wrote:
>>>
>>> On 15-Nov-13, at 9:51 AM, Jakub Jelinek wrote:
>>>
 On Fri, Nov 15, 2013 at 08:16:47AM -0600, Peter Bergner wrote:
>
>
> On Wed, 2013-11-13 at 11:25 -0600, Peter Bergner wrote:
>>
>>
>> On Wed, 2013-11-13 at 00:49 +0100, Jakub Jelinek wrote:
>>>
>>>
>>> 2013-11-12  Jakub Jelinek  
>>>
>>>* sanitizer_common/sanitizer_platform_limits_linux.cc:
>>> Temporarily
>>>ifdef out almost the whole source.
>>>* sanitizer_common/sanitizer_common_syscalls.inc: Likewise.
>>
>>
>>
>> That helps, but as Pat reported in the bugzilla, it still is failing.
>> With the following patch, we can now bootstrap on powerpc64-linux.
>>
>> Is this ok for trunk?
>>
>> Does this help the other architectures that are failing for the same
>> build error?
>
>
>
> Ok, Dave reported in PR59009 that my last patch still left a few build
> problems on HPPA.  Dave tested the patch below and confirmed this
> cleans



 How can there be problems on HPPA?  libsanitizer/configure.tgt says that
 hppa* is UNSUPPORTED, so libsanitizer should never be built there.
 Furthermore, it would be nice to understand why the sigaction is
 different.
>>>
>>>
>>>
>>>
>>> Actually, it turns out I have had a patch in my tree enabling it.
>>>
>>> Dave
>>> --
>>> John David Anglin   dave.ang...@bell.net
>>>
>>>
>>
>
> --
> John David Anglin   dave.ang...@bell.net
>
>
>


Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p

2013-11-18 Thread Yvan Roux
So, the validation is ok with this patch, I'm just not able to say if
the original performance issue is still fixed with it.  Could you
check it Kyrylo ?

Yvan

2013-11-17  Yvan Roux  

* config/arm/arm.md (store_minmaxsi): Use only when
optimize_function_for_size_p.


On 15 November 2013 15:59, Yvan Roux  wrote:
>> Sometimes 4 will be needed, since both original register values may
>> remain live.
>
> Indeed.
>
>> However, I'm inclined to agree that while it should be possible to
>> decide at the *function* level whether or not an insn is valid, doing so
>> at the block level is probably unsafe.
>
> Ok, so the attached patch should fix the issue, its validation is ongoing.


Re: [PATCH, ARM, LRA] Fixed bootstrap failure in Thumb mode

2013-11-18 Thread Yvan Roux
Ping.

On 7 November 2013 15:56, Yvan Roux  wrote:
> Hi,
>
> this patch fixed an LRA cycling due to secondary reload (Thumb mode).
> Notice that this patch is a prerequisite to turn on LRA by default on
> ARM.  Bootstrapped on a9 and a15 without any regression in the
> testsuite as LRA is off by default and with the regression reported in
> the thread bellow when LRA is on.
>
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html
>
> Thanks,
> Yvan
>
> 2013-11-07  Yvan Roux  
>
> * config/arm/arm.h (THUMB_SECONDARY_INPUT_RELOAD_CLASS): Return 
> NO_REGS
> for LRA.


[Patch, Fortran, OOP] PR 59143: Bogus warning with array-valued type-bound procedure

2013-11-18 Thread Janus Weil
Hi all,

here is a straightforward patch to teach 'get_expr_storage_size' about
type-bound procedures (which are handled internally as
procedure-pointer components of the corresponding vtab). In that sense
the patch should handle both TBPs as well as PPCs.

Regtested on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus



2013-11-18  Janus Weil  

PR fortran/59143
* interface.c (get_expr_storage_size): Handle array-valued type-bound
procedures.

2013-11-18  Janus Weil  

PR fortran/59143
* gfortran.dg/typebound_proc_30.f90: New.
Index: gcc/fortran/interface.c
===
--- gcc/fortran/interface.c (revision 204897)
+++ gcc/fortran/interface.c (working copy)
@@ -2426,6 +2426,17 @@ get_expr_storage_size (gfc_expr *e)
- mpz_get_si (ref->u.ar.as->lower[i]->value.integer));
}
 }
+  else if (ref->type == REF_COMPONENT && ref->u.c.component->attr.function
+  && ref->u.c.component->attr.proc_pointer
+  && ref->u.c.component->attr.dimension)
+   {
+ /* Array-valued procedure-pointer components.  */
+ gfc_array_spec *as = ref->u.c.component->as;
+ for (i = 0; i < as->rank; i++)
+   elements = elements
+ * (mpz_get_si (as->upper[i]->value.integer)
+ - mpz_get_si (as->lower[i]->value.integer) + 1L);
+   }
 }
 
   if (substrlen)
! { dg-do compile }
!
! PR 59143: [OOP] Bogus warning with array-valued type-bound procedure
!
! Contributed by Jürgen Reuter 

module phs_single

  type :: phs_single_t
   contains
 procedure, nopass :: decay_p 
  end type
  
contains

  subroutine evaluate (phs)
class(phs_single_t) :: phs
call func1 (phs%decay_p ())
  end subroutine

  subroutine func1 (p)
real :: p(2)
  end subroutine
  
  function decay_p ()
real :: decay_p(2)
decay_p = 1.
  end function

end module

! { dg-final { cleanup-modules "phs_single" } }


Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread Arnaud Charlet
> Yes, it passed all tests with -m32, -mx32, -m64 on Linux/x86-64.
> Installed on trunk. Is this OK to backport to 4.8 branch after
> a few days?

Actually there are two issues with your change:

Using time_t for tv_nsec looks actually wrong, the definition on (my)
linux is:

struct timespec {
__kernel_time_t tv_sec; /* seconds */
longtv_nsec;/* nanoseconds */
};

As you can see, the types for tv_sec and tv_nsec are different.

So the change to timespec looks wrong to me.

Also changing tv_nsec type breaks s-osinte-solaris-posix.ads which would
need a similar change, although not if we change the above.

Can you please follow up on that?

In particular, what the definition of timespec on x32 linux?

Shouldn't tv_nsec be back to a long as it was before? Looks to me this
way.

Arno


Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread Arnaud Charlet
> Actually there are two issues with your change:
> 
> Using time_t for tv_nsec looks actually wrong, the definition on (my)
> linux is:
> 
> struct timespec {
> __kernel_time_t tv_sec; /* seconds */
> longtv_nsec;/* nanoseconds */
> };
> 
> As you can see, the types for tv_sec and tv_nsec are different.
> 
> So the change to timespec looks wrong to me.

In addition concerning s-taprop-linux.adb, the definition of struct
timeval on linux is:

/* A time value that is accurate to the nearest
   microsecond but also has a range of years.  */
struct timeval
  {
__time_t tv_sec;/* Seconds.  */
__suseconds_t tv_usec;  /* Microseconds.  */
  };

So again, we have another type (__suseconds_t), and NOT time_t or __time_t
so it's really wrong for tv_nsec to use time_t.

Arno


Re: [0/10] Replace host_integerp and tree_low_cst

2013-11-18 Thread Richard Sandiford
Jeff Law  writes:
> On 11/16/13 05:53, Richard Sandiford wrote:
>> After the patch that went in yesterday, all calls to host_integerp and
>> tree_low_cst pass a constant "pos" argument.  This series replaces each
>> function with two separate ones:
> [ ... ]
> So I've almost entirely ignored the whole wide-int conversion discussion 
> and I suspect I'm not entirely alone.
>
> Can you briefly summarize what's y'all are trying to accomplish with the 
> wide-int changes?

At the moment, we can only handle tree and rtl integer constants that
fit in 2 HOST_WIDE_INTs.  The idea is to remove that limit.  E.g. things
like OImode (used in a few ports) will become a first-class citizen,
with all OImode values being representable.

Besides that headline reason, there are various side benefits.  E.g.:

- All INTEGER_CSTs can be viewed either in their TYPE_PRECISION or in
  "infinite" precision, which isn't possible for 128-bit constants today.
  (I.e. there's no way to distinguish signed and unsigned 128-bit constants
  in a double_int.)

- Wider-than-2-HWI intermediate results can be represented as a single
  integer.  I'm told this is useful for VRP.  (wide-int is mostly Kenny
  and Mike's work, I've just been butting in recently.)

- rtl-level constant folding can use the same code to handle all
  combinations of CONST_INT and CONST_DOUBLE (and CONST_WIDE_INT,
  on converted ports).  At the moment we handle CONST_INT cases
  specially, and don't try as hard with CONST_DOUBLEs.

Implementation-wise, it tries to make it so that the common single-HWI
cases are still fast.

Thanks,
Richard


Re: Pass floating point values on powerpc64 as per ABI

2013-11-18 Thread Andreas Schwab
Alan Modra  writes:

> diff -urp gcc3/libffi/testsuite/libffi.call/cls_double_va.c 
> gcc4/libffi/testsuite/libffi.call/cls_double_va.c
> --- gcc3/libffi/testsuite/libffi.call/cls_double_va.c 2013-11-15 
> 23:03:07.193964372 +1030
> +++ gcc4/libffi/testsuite/libffi.call/cls_double_va.c 2013-11-15 
> 23:22:51.383884118 +1030
> @@ -38,26 +38,24 @@ int main (void)
>  
>   /* This printf call is variadic */
>   CHECK(ffi_prep_cif_var(&cif, FFI_DEFAULT_ABI, 1, 2, &ffi_type_sint,
> - arg_types) == FFI_OK);
> +arg_types) == FFI_OK);
>  
>   args[0] = &format;
>   args[1] = &doubleArg;
>   args[2] = NULL;
>  
>   ffi_call(&cif, FFI_FN(printf), &res, args);
> - // { dg-output "7.0" }
> + /* { dg-output "7.0" } */
>   printf("res: %d\n", (int) res);
> - // { dg-output "\nres: 4" }
> + /* { dg-output "\nres: 4" } */
>  
> - /* The call to cls_double_va_fn is static, so have to use a normal 
> prep_cif */
> - CHECK(ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2, &ffi_type_sint, arg_types) 
> == FFI_OK);

This breaks ia64:

FAIL: libffi.call/cls_double_va.c -O0 -W -Wall output pattern test, is 7.0
res: 4
0.0
res: 4
, should match 7.0?
res: 4?
7.0?
res: 4

FAIL: libffi.call/cls_longdouble_va.c -O0 -W -Wall output pattern test, is 7.0
res: 4
0.0
res: 4
, should match 7.0?
res: 4?
7.0?
res: 4

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[AArch64] AArch64 SIMD Builtins Better Type Correctness.

2013-11-18 Thread James Greenhalgh

Hi,

This patch adds infrastructure to allow us to model the correct
types for the intrinsics in arm_neon.h. Thus far we have relied
on ugly casts between data types, and treated every intrinsic
as taking 'signed' vectors.

So we have intrinsics in arm_neon.h which look like:

uint32x4_t
vaddw_high_u16 (uint32x4_t __a, uint16x8_t __b)
{
  return (uint32x4_t) __builtin_aarch64_uaddw2v8hi ((int32x4_t) __a,
(int16x8_t) __b);
}

What we would really like is to remove these casts so we get something
more like:

uint32x4_t
vaddw_high_u16 (uint32x4_t __a, uint16x8_t __b)
{
  return __builtin_aarch64_uaddw2v8hi_uuu (__a, __b);
}

This is much easier on the eye, and less error prone.

This patch adds some infrastructure to encode the type signatures
for the functions and to construct vector types as we need them.

I've tested the patch on aarch64-none-elf with no regressions. And
put it up against an internal testsuite for Neon intrinsics with no
problems.

OK?

Thanks,
James

---
gcc/

2013-11-18  James Greenhalgh  

* gcc/config/aarch64/aarch64-builtins.c
(aarch64_simd_itype): Remove.
(aarch64_simd_builtin_datum): Remove itype, add
qualifiers pointer.
(VAR1): Use qualifiers.
(aarch64_build_scalar_type): New.
(aarch64_build_vector_type): Likewise.
(aarch64_build_type): Likewise.
(aarch64_init_simd_builtins): Refactor, remove special cases,
consolidate main loop.
(aarch64_simd_expand_args): Likewise.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 2f1a8d0..28f576d 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -81,57 +81,101 @@ enum aarch64_simd_builtin_type_mode
 
 #define UP(X) X##_UP
 
-typedef enum
+#define SIMD_MAX_BUILTIN_ARGS 5
+
+enum aarch64_type_qualifiers
 {
-  AARCH64_SIMD_BINOP,
-  AARCH64_SIMD_TERNOP,
-  AARCH64_SIMD_QUADOP,
-  AARCH64_SIMD_UNOP,
-  AARCH64_SIMD_GETLANE,
-  AARCH64_SIMD_SETLANE,
-  AARCH64_SIMD_CREATE,
-  AARCH64_SIMD_DUP,
-  AARCH64_SIMD_DUPLANE,
-  AARCH64_SIMD_COMBINE,
-  AARCH64_SIMD_SPLIT,
-  AARCH64_SIMD_LANEMUL,
-  AARCH64_SIMD_LANEMULL,
-  AARCH64_SIMD_LANEMULH,
-  AARCH64_SIMD_LANEMAC,
-  AARCH64_SIMD_SCALARMUL,
-  AARCH64_SIMD_SCALARMULL,
-  AARCH64_SIMD_SCALARMULH,
-  AARCH64_SIMD_SCALARMAC,
-  AARCH64_SIMD_CONVERT,
-  AARCH64_SIMD_FIXCONV,
-  AARCH64_SIMD_SELECT,
-  AARCH64_SIMD_RESULTPAIR,
-  AARCH64_SIMD_REINTERP,
-  AARCH64_SIMD_VTBL,
-  AARCH64_SIMD_VTBX,
-  AARCH64_SIMD_LOAD1,
-  AARCH64_SIMD_LOAD1LANE,
-  AARCH64_SIMD_STORE1,
-  AARCH64_SIMD_STORE1LANE,
-  AARCH64_SIMD_LOADSTRUCT,
-  AARCH64_SIMD_LOADSTRUCTLANE,
-  AARCH64_SIMD_STORESTRUCT,
-  AARCH64_SIMD_STORESTRUCTLANE,
-  AARCH64_SIMD_LOGICBINOP,
-  AARCH64_SIMD_SHIFTINSERT,
-  AARCH64_SIMD_SHIFTIMM,
-  AARCH64_SIMD_SHIFTACC
-} aarch64_simd_itype;
+  /* T foo.  */
+  qualifier_none = 0x0,
+  /* unsigned T foo.  */
+  qualifier_unsigned = 0x1, /* 1 << 0  */
+  /* const T foo.  */
+  qualifier_const = 0x2, /* 1 << 1  */
+  /* T *foo.  */
+  qualifier_pointer = 0x4, /* 1 << 2  */
+  /* const T *foo.  */
+  qualifier_const_pointer = 0x6, /* qualifier_const | qualifier_pointer  */
+  /* Used when expanding arguments if an operand could
+ be an immediate.  */
+  qualifier_immediate = 0x8, /* 1 << 3  */
+  qualifier_maybe_immediate = 0x10, /* 1 << 4  */
+  /* void foo (...).  */
+  qualifier_void = 0x20, /* 1 << 5  */
+  /* Some patterns may have internal operands, this qualifier is an
+ instruction to the initialisation code to skip this operand.  */
+  qualifier_internal = 0x40, /* 1 << 6  */
+  /* Some builtins should use the T_*mode* encoded in a simd_builtin_datum
+ rather than using the type of the operand.  */
+  qualifier_map_mode = 0x80, /* 1 << 7  */
+  /* qualifier_pointer | qualifier_map_mode  */
+  qualifier_pointer_map_mode = 0x84,
+  /* qualifier_const_pointer | qualifier_map_mode  */
+  qualifier_const_pointer_map_mode = 0x86
+};
 
 typedef struct
 {
   const char *name;
-  const aarch64_simd_itype itype;
   enum aarch64_simd_builtin_type_mode mode;
   const enum insn_code code;
   unsigned int fcode;
+  enum aarch64_type_qualifiers *qualifiers;
 } aarch64_simd_builtin_datum;
 
+static enum aarch64_type_qualifiers
+aarch64_types_unop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none };
+#define TYPES_UNOP (aarch64_types_unop_qualifiers)
+#define TYPES_CREATE (aarch64_types_unop_qualifiers)
+#define TYPES_REINTERP (aarch64_types_unop_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_binop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_maybe_immediate };
+#define TYPES_BINOP (aarch64_types_binop_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_ternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_none, qualifier_none };
+#define TYPES_TE

Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p

2013-11-18 Thread Kyrill Tkachov

On 18/11/13 08:37, Yvan Roux wrote:

So, the validation is ok with this patch, I'm just not able to say if
the original performance issue is still fixed with it.  Could you
check it Kyrylo ?


Hi Yvan,
I'll run the benchmark today to confirm the performance, but from compiling some 
code sequences that exhibited the bad behaviour in the past, I see that this 
patch still fixes the issues. store_minmaxsi is not generated when optimising 
for speed.



Thanks,
Kyrill



Yvan

2013-11-17  Yvan Roux  

 * config/arm/arm.md (store_minmaxsi): Use only when
 optimize_function_for_size_p.


On 15 November 2013 15:59, Yvan Roux  wrote:

Sometimes 4 will be needed, since both original register values may
remain live.

Indeed.


However, I'm inclined to agree that while it should be possible to
decide at the *function* level whether or not an insn is valid, doing so
at the block level is probably unsafe.

Ok, so the attached patch should fix the issue, its validation is ongoing.





Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p

2013-11-18 Thread Yvan Roux
> Hi Yvan,
> I'll run the benchmark today to confirm the performance, but from compiling
> some code sequences that exhibited the bad behaviour in the past, I see that
> this patch still fixes the issues. store_minmaxsi is not generated when
> optimising for speed.

Ok Cool, Thanks Kyrill

Cheers,
Yvan


RE: [PATCH, libgcc] Disable JCR section when java is not enabled

2013-11-18 Thread Joey Ye
Ping, as wasting 8 bytes of RAM isn't ignorable on embedded system.

OK to trunk stage 1?

> -Original Message-
> From: Tom Tromey [mailto:tro...@redhat.com]
> Sent: Thursday, October 10, 2013 21:32
> To: Jakub Jelinek
> Cc: Joey Ye; p...@bothner.com; a...@redhat.com; H.J. Lu; gcc-patches; 'Ian
> Lance Taylor'
> Subject: Re: [PATCH, libgcc] Disable JCR section when java is not enabled
> 
> Jakub> Given the state of gcj that it is now only rarely used and most
> Jakub> people just use OpenJDK instead, wouldn't it be a good idea to
> Jakub> just require that gcj code is linked using gcj driver or, if
> Jakub> linked in any other driver, just using a special non-default
> Jakub> option (-flink-jcr or similar), that would be automatically set
> Jakub> by gcj driver, move this JCR stuff out of the normal crt* files
> Jakub> and put it into crtjava*.o instead, and only link in if
> Jakub> -flink-jcr is passed or gcj driver used?  Or treat -lgcj as that
> Jakub> magic switch?
> 
> The irony of the situation is that this would require significantly more
work
> than has gone into gcj in the past N years.
> 
> Jakub> Also, looking at crtstuff.c makes me wonder where are classes
> Jakub> deregistered, there are only calls to _Jv_RegisterClasses, but
> Jakub> never to to deregistration, wonder what happens if you dlclose a
> Jakub> shared library with registered classes.
> 
> I think we never implemented class GC for compiled classes, though it's
hard
> to remember.
> 
> Tom


jcr_disable_non_java-130910.patch
Description: Binary data


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-18 Thread Ilya Enkovich
Ping

2013/11/8 Ilya Enkovich :
> Hi,
>
> Here is an updated patch version with no langhook.
>
> Regarding TLS objects issue - I do not think compiler should compensate the 
> absence of instrumentation in libraries.  Compiler should be responsible for 
> initialization of Bounds Tables for .tdata section.  Correct data copy is a 
> responsibility of library.  User should use either instrumented library or 
> wrapper calls if he needs this functionality.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2013-11-06  Ilya Enkovich  
>
> * c/c-parser.c: Include tree-chkp.h.
> (c_parser_declaration_or_fndef): Register statically
> initialized decls in Pointer Bounds Checker.
> * cp/decl.c: Include tree-chkp.h.
> (cp_finish_decl): Register statically
> initialized decls in Pointer Bounds Checker.
> * gimplify.c: Include tree-chkp.h.
> (gimplify_init_constructor): Register statically
> initialized decls in Pointer Bounds Checker.
>
>
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index 9ccae3b..397b323 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cgraph.h"
>  #include "plugin.h"
>  #include "omp-low.h"
> +#include "tree-chkp.h"
>
>
>  /* Initialization routine for this file.  */
> @@ -1682,6 +1683,12 @@ c_parser_declaration_or_fndef (c_parser *parser, bool 
> fndef_ok,
>   maybe_warn_string_init (TREE_TYPE (d), init);
>   finish_decl (d, init_loc, init.value,
>init.original_type, asm_name);
> +
> + /* Register all decls with initializers in Pointer
> +Bounds Checker to generate required static bounds
> +initializers.  */
> + if (DECL_INITIAL (d) != error_mark_node)
> +   chkp_register_var_initializer (d);
> }
> }
>   else
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 1e92f2a..74df02f 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "splay-tree.h"
>  #include "plugin.h"
>  #include "cgraph.h"
> +#include "tree-chkp.h"
>
>  /* Possible cases of bad specifiers type used by bad_specifiers. */
>  enum bad_spec_place {
> @@ -6379,6 +6380,12 @@ cp_finish_decl (tree decl, tree init, bool 
> init_const_expr_p,
>  the class specifier.  */
>   if (!DECL_EXTERNAL (decl))
> var_definition_p = true;
> +
> + /* If var has initilizer then we need to register it in
> +Pointer Bounds Checker to generate static bounds initilizer
> +if required.  */
> + if (DECL_INITIAL (decl) && DECL_INITIAL (decl) != error_mark_node)
> +   chkp_register_var_initializer (decl);
> }
>/* If the variable has an array type, lay out the type, even if
>  there is no initializer.  It is valid to index through the
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 4f52c27..7aaac15 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "vec.h"
>  #include "omp-low.h"
>  #include "gimple-low.h"
> +#include "tree-chkp.h"
>
>  #include "langhooks-def.h" /* FIXME: for lhd_set_decl_assembler_name */
>  #include "tree-pass.h" /* FIXME: only for PROP_gimple_any */
> @@ -4111,6 +4112,11 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
> *pre_p, gimple_seq *post_p,
>
> walk_tree (&ctor, force_labels_r, NULL, NULL);
> ctor = tree_output_constant_def (ctor);
> +
> +   /* We need to register created constant object to
> +  initialize bounds for pointers in it.  */
> +   chkp_register_var_initializer (ctor);
> +
> if (!useless_type_conversion_p (type, TREE_TYPE (ctor)))
>   ctor = build1 (VIEW_CONVERT_EXPR, type, ctor);
> TREE_OPERAND (*expr_p, 1) = ctor;


Re: [PATCH, MPX, 2/X] Pointers Checker [9/25] Bound constants

2013-11-18 Thread Ilya Enkovich
Ping

2013/11/7 Ilya Enkovich :
> 2013/11/7 Jeff Law :
>> On 10/31/13 03:15, Ilya Enkovich wrote:
>>>
>>> Hi,
>>>
>>> Here is a patch which adds support for bound constant to be used as
>>> DECL_INITIAL for constant static bounds generated by compiler.
>>>
>>> Thanks,
>>> Ilya
>>> --
>>>
>>> gcc/
>>>
>>> 2013-10-23  Ilya Enkovich  
>>>
>>> * emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS.
>>> * explow.c (trunc_int_for_mode): Likewise.
>>> * varpool.c (ctor_for_folding): Do not fold constant
>>> bounds vars.
>>
>> I'm having a bit of trouble reconciling "add support for bound constant to
>> be used as DECL_INITIAL" rationale text and the actual patch.
>>
>> From reading the patch it appears that you want to allow generation of
>> immediate constants for objects with MODE_POINTER_BOUNDS.  OK, I can see how
>> that is useful.
>>
>> I can kindof see how you want to error out if someone asks for a constant to
>> be truncated to MODE_POINTER_BOUNDS.  Did this trip in practice or is it
>> preemptive?
>
> As far as I remember change in trunc_int_mode was required to expand
> bound constants on 32bit target. Size of the constant is equal to size
> of the HOST_WIDE_INT and thus constant generation goes through
> gen_int_mode and trunc_int_for_mode.
>
>>
>>
>>
>>> diff --git a/gcc/varpool.c b/gcc/varpool.c
>>> index 2eb1fc1..d9c08c1 100644
>>> --- a/gcc/varpool.c
>>> +++ b/gcc/varpool.c
>>> @@ -254,6 +254,12 @@ ctor_for_folding (tree decl)
>>> && TREE_CODE (decl) != CONST_DECL)
>>>   return error_mark_node;
>>>
>>> +  /* Static constant bounds are created to be
>>> + used instead of constants and therefore
>>> + do not let folding it.  */
>>> +  if (POINTER_BOUNDS_P (decl))
>>> +return error_mark_node;
>>
>> Here's the part I'm struggling a bit with.Why did you need this?
>>
>> Isn't this going to prevent that DECL from being used in folding?   The
>> bounds shouldn't really affect that AFAICT.
>
> Bounds constants were introduced only for initialization of constant
> bound vars. Such vars are used to hold commonly used zero bounds (for
> cases when bounds are unknown) values and null bounds (for null
> pointers). Usage of such vars is optional and is controlled via
> compiler flag. It is used to try to decrease overhead on bounds
> creation. E.g. for MPX we need two instructions to create zero bounds
> and also it require one GPR. One of these instructions does not become
> nop when MPX is off which additionally increases overhead.  Having
> constant var we can just load bounds using one MPX instruction.  And
> if I do not prevent folding for these vars then all constant bounds
> vars usages are replaced with immediate bounds constant usages and I
> do not get desired effect.  Since there are no instructions working
> with bounds immediates, I do not see reasons for folding.
>
> Thanks,
> Ilya
>
>>
>> jeff


Re: Some wide-int review comments

2013-11-18 Thread Richard Sandiford
Thanks for the changes.

> @@ -8162,7 +8162,7 @@ fold_builtin_logarithm (location_t loc,
>   /* Prepare to do logN(exp10(exponent) -> exponent*logN(10).  */
>   {
> REAL_VALUE_TYPE dconst10;
> -   real_from_integer (&dconst10, VOIDmode, 10, SIGNED);
> +   real_from_integer (&dconst10, VOIDmode, wi::shwi (10, 32), 
> SIGNED);
> x = build_real (type, dconst10);
>   }
>   exponent = CALL_EXPR_ARG (arg, 0);
> @@ -8315,7 +8315,7 @@ fold_builtin_pow (location_t loc, tree f
>  
>/* Check for an integer exponent.  */
>n = real_to_integer (&c);
> -  real_from_integer (&cint, VOIDmode, n, SIGNED);
> +  real_from_integer (&cint, VOIDmode, wi::shwi (n, 
> HOST_BITS_PER_WIDE_INT), SIGNED);
>if (real_identical (&c, &cint))
>   {
> /* Attempt to evaluate pow at compile-time, unless this should

Are these changes necessary?  The original calls ought to work as-is,
since the function takes a const wide_int_ref &.  Same for the rest of
the patch.

> Index: gcc/c/c-parser.c
> ===
> --- gcc/c/c-parser.c  (revision 204918)
> +++ gcc/c/c-parser.c  (working copy)
> @@ -13375,7 +13375,7 @@ c_parser_cilk_clause_vectorlength (c_par
>|| !TREE_CONSTANT (expr)
>|| !INTEGRAL_TYPE_P (TREE_TYPE (expr)))
>  error_at (loc, "vectorlength must be an integer constant");
> -  else if (exact_log2 (tree_to_hwi (expr)) == -1)
> +  else if (wi::eq_p (wi::exact_log2 (expr), -1))
>  error_at (loc, "vectorlength must be a power of 2");
>else
>  {

FWIW:

  wi::exact_log2 (expr) == -1

should still work.

> Index: gcc/dwarf2out.c
> ===
> --- gcc/dwarf2out.c   (revision 204918)
> +++ gcc/dwarf2out.c   (working copy)
> @@ -13428,8 +13428,6 @@ loc_descriptor (rtx rtl, enum machine_mo
>  
>if (mode != VOIDmode && (dwarf_version >= 4 || !dwarf_strict))
>   {
> -   gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
> -
> /* Note that a CONST_DOUBLE rtx could represent either an integer
>or a floating-point constant.  A CONST_DOUBLE is used whenever
>the constant requires more than one word in order to be

The copy of the CONST_DOUBLE comment is still there though.
(This is handling CONST_WIDE_INT rather than CONST_DOUBLE.)

> Index: gcc/tree.c
> ===
> --- gcc/tree.c(revision 204918)
> +++ gcc/tree.c(working copy)
> @@ -8536,8 +8536,18 @@ retry:
>  return true;
>  
>/* Third, unsigned integers with top bit set never fit signed types.  */
> -  if (!TYPE_UNSIGNED (type) && sgn_c == UNSIGNED && wi::neg_p (c))
> -return false;
> +  if (!TYPE_UNSIGNED (type) && sgn_c == UNSIGNED)
> +{
> +  int uprec = GET_MODE_PRECISION (TYPE_MODE TREE_TYPE (c));
> +  if (uprec < TYPE_PRECISION (TREE_TYPE (c)))
> + {
> +   wide_int x = wi::sext (c, uprec);
> +   if (wi::neg_p (x) || wi::ne_p (x, c))
> + return false;
> + }
> +  else if (wi::neg_p (c))
> + return false;
> +}
>  
>/* If we haven't been able to decide at this point, there nothing more we
>   can check ourselves here.  Look at the base type if we have one and it

I don't really understand this change, but I suppose it's part of
the trunk patch.

Looks good to me otherwise FWIW.

Thanks,
Richard


[PATCH, MPX, 2/X] Pointers Checker [11/25] Expand builtins

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to adopt some builtins expanding to Pointer Bounds Checker.  
Patch mostly disables inlining of instrumented calls to string function.  Also 
adds support for _NOBND_NOCHK versions of string functions (this version does 
not check and copy bounds; therefore may be inlined as regular string function).

Thanks,
Ilya
--
2013-11-13  Ilya Enkovich  

* builtins.c: Include rtl-chkp.h, tree-chkp.h.
(expand_builtin_mempcpy_args): Add orig exp as argument.
Support BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK.
(expand_builtin_mempcpy): Adjust expand_builtin_mempcpy_args call.
(expand_builtin_stpcpy): Likewise.
(expand_builtin_memset_args): Support BUILT_IN_CHKP_MEMSET_NOBND_NOCHK.
(std_expand_builtin_va_start): Initialize bounds for va_list.
(expand_builtin): Support instrumented calls.
* optabs.c: Include rtl-chkp.h.
(expand_unop): Handle bounds in libcall return value.


diff --git a/gcc/builtins.c b/gcc/builtins.c
index 7a04664..b46c364 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -50,6 +50,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "ubsan.h"
 #include "cilk.h"
+#include "tree-chkp.h"
+#include "rtl-chkp.h"
 
 
 static tree do_mpc_arg1 (tree, tree, int (*)(mpc_ptr, mpc_srcptr, mpc_rnd_t));
@@ -115,7 +117,7 @@ static rtx builtin_memcpy_read_str (void *, HOST_WIDE_INT, 
enum machine_mode);
 static rtx expand_builtin_memcpy (tree, rtx);
 static rtx expand_builtin_mempcpy (tree, rtx, enum machine_mode);
 static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx,
-   enum machine_mode, int);
+   enum machine_mode, int, tree);
 static rtx expand_builtin_strcpy (tree, rtx);
 static rtx expand_builtin_strcpy_args (tree, tree, rtx);
 static rtx expand_builtin_stpcpy (tree, rtx, enum machine_mode);
@@ -3195,7 +3197,8 @@ expand_builtin_mempcpy (tree exp, rtx target, enum 
machine_mode mode)
   tree src = CALL_EXPR_ARG (exp, 1);
   tree len = CALL_EXPR_ARG (exp, 2);
   return expand_builtin_mempcpy_args (dest, src, len,
- target, mode, /*endp=*/ 1);
+ target, mode, /*endp=*/ 1,
+ exp);
 }
 }
 
@@ -3207,10 +3210,23 @@ expand_builtin_mempcpy (tree exp, rtx target, enum 
machine_mode mode)
 
 static rtx
 expand_builtin_mempcpy_args (tree dest, tree src, tree len,
-rtx target, enum machine_mode mode, int endp)
+rtx target, enum machine_mode mode, int endp,
+tree orig_exp)
 {
+  tree fndecl = get_callee_fndecl (orig_exp);
+
 /* If return value is ignored, transform mempcpy into memcpy.  */
-  if (target == const0_rtx && builtin_decl_implicit_p (BUILT_IN_MEMCPY))
+  if (target == const0_rtx
+  && DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK
+  && builtin_decl_implicit_p (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK))
+{
+  tree fn = builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK);
+  tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3,
+  dest, src, len);
+  return expand_expr (result, target, mode, EXPAND_NORMAL);
+}
+  else if (target == const0_rtx
+  && builtin_decl_implicit_p (BUILT_IN_MEMCPY))
 {
   tree fn = builtin_decl_implicit (BUILT_IN_MEMCPY);
   tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3,
@@ -3394,7 +3410,8 @@ expand_builtin_stpcpy (tree exp, rtx target, enum 
machine_mode mode)
 
   lenp1 = size_binop_loc (loc, PLUS_EXPR, len, ssize_int (1));
   ret = expand_builtin_mempcpy_args (dst, src, lenp1,
-target, mode, /*endp=*/2);
+target, mode, /*endp=*/2,
+exp);
 
   if (ret)
return ret;
@@ -3679,7 +3696,8 @@ expand_builtin_memset_args (tree dest, tree val, tree len,
  do_libcall:
   fndecl = get_callee_fndecl (orig_exp);
   fcode = DECL_FUNCTION_CODE (fndecl);
-  if (fcode == BUILT_IN_MEMSET)
+  if (fcode == BUILT_IN_MEMSET
+  || fcode == BUILT_IN_CHKP_MEMSET_NOBND_NOCHK)
 fn = build_call_nofold_loc (EXPR_LOCATION (orig_exp), fndecl, 3,
dest, val, len);
   else if (fcode == BUILT_IN_BZERO)
@@ -4231,6 +4249,13 @@ std_expand_builtin_va_start (tree valist, rtx nextarg)
 {
   rtx va_r = expand_expr (valist, NULL_RTX, VOIDmode, EXPAND_WRITE);
   convert_move (va_r, nextarg, 0);
+
+  /* We do not have any valid bounds for the pointer, so
+ just store zero bounds for it.  */
+  if (chkp_function_instrumented_p (current_function_decl))
+chkp_expand_bounds_reset_for_mem (valist,
+ make_tree (TREE_TYPE (valist),
+   

[Ada] Internal error on renaming of constant array with -gnatc

2013-11-18 Thread Eric Botcazou
This fixes an ICE on an obscure case with the -gnatc switch.

Tested on x86_64-suse-linux, applied on the mainline.


2013-11-18  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity) : Deal with an
error mark as renamed object in type annotating mode.


-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 204913)
+++ gcc-interface/decl.c	(working copy)
@@ -1117,8 +1117,12 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 			   as we have a VAR_DECL for the pointer we make.  */
 		  }
 
-		gnu_expr = build_unary_op (ADDR_EXPR, gnu_type,
-	   maybe_stable_expr);
+		if (type_annotate_only
+ 			&& TREE_CODE (maybe_stable_expr) == ERROR_MARK)
+		  gnu_expr = NULL_TREE;
+		else
+		  gnu_expr = build_unary_op (ADDR_EXPR, gnu_type,
+		 maybe_stable_expr);
 
 		gnu_size = NULL_TREE;
 		used_by_ref = true;


[PATCH, MPX, 2/X] Pointers Checker [12/25] Expand bounded params

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to expand bounded arguments for calls, input bounds for bounded 
params and returned bounds.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* calls.c: Include tree-chkp.h, rtl-chkp.h.
(arg_data): Add fields for bounds information
and to store value pushed to the stack.
(emit_call_1): Propagate instrumentation flag for CALL.
(precompute_register_parameters): Expand bounds of the arg.
(initialize_argument_information): Fill bounds information.
(load_register_parameters): Load passed bounds.
(expand_call): Handle passed and returned bounds.
(emit_library_call_value_1): Filter out returned bounds.
(store_one_arg): Store bounds of arg.
* cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
(expand_call_stmt): Propagate instrumentation flag for CALL_EXPR.
(expand_return): Handle returned bounds.
(expand_gimple_stmt_1): Adjust to new expand_return signature.
(gimple_expand_cfg): Reset rtx bounds map.
* explow.c: Include rtl-chkp.h.
(hard_function_value): Handle returned bounds.
* expr.h (store_expr): Add param for bounds target.
* expr.c: Include tree-chkp.h, rtl-chkp.h.
(expand_assignment): Handle returned bounds.
(store_expr): Likewise.
(store_constructor): Adjust to new store_expr signature.
(store_field): Likewise.
(expand_expr_real_2): Likewise.
(expand_expr_real_1): Likewise.
* tree-outof-ssa.c (insert_value_copy_on_edge): Adjust to new
store_expr signature.
* function.c: Include tree-chkp.h, rtl-chkp.h.
(aggregate_value_p): Handle returned bounds.
(use_register_for_decl): Do not registerize decls used for bounds
stores and loads.
(assign_parm_data_one): Add field for bounds.
(assign_parm_find_entry_rtl): Fill bounds info.
(assign_parms): Initialize input bounds for args.
(expand_function_start): Handle returned bounds.
(diddle_return_value_1): New.
(diddle_return_value): Handle returned bounds.
(expand_function_end): Likewise.
* function.h (rtl_data): Add field for returned bounds.


diff --git a/gcc/calls.c b/gcc/calls.c
index 4dcdb27..7ebf310 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -40,6 +40,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "except.h"
 #include "dbgcnt.h"
+#include "tree-chkp.h"
+#include "rtl-chkp.h"
 
 /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits.  */
 #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
@@ -50,12 +52,20 @@ struct arg_data
 {
   /* Tree node for this argument.  */
   tree tree_value;
+  /* Bounds tree node for this argument.  */
+  tree bounds_value;
+  /* Bounds RTL for this argument.  */
+  rtx bounds;
+  /* Slot to be used to pass bounds.  */
+  rtx bounds_slot;
   /* Mode for value; TYPE_MODE unless promoted.  */
   enum machine_mode mode;
   /* Current RTL value for argument, or 0 if it isn't precomputed.  */
   rtx value;
   /* Initially-compute RTL value for argument; only for const functions.  */
   rtx initial_value;
+  /* Pushed value.  */
+  rtx pushed_value;
   /* Register to pass this argument in, 0 if passed on stack, or an
  PARALLEL if the arg is to be copied into multiple non-contiguous
  registers.  */
@@ -387,6 +397,10 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, 
tree fndecl ATTRIBUTE_UNU
   && MEM_EXPR (funmem) != NULL_TREE)
 set_mem_expr (XEXP (call, 0), MEM_EXPR (funmem));
 
+  /* Mark instrumented calls.  */
+  if (call && fntree)
+CALL_EXPR_WITH_BOUNDS_P (call) = CALL_WITH_BOUNDS_P (fntree);
+
   /* Put the register usage information there.  */
   add_function_usage_to (call_insn, call_fusage);
 
@@ -865,6 +879,59 @@ precompute_register_parameters (int num_actuals, struct 
arg_data *args,
  && targetm.small_register_classes_for_mode_p 
(args[i].mode))
 || optimize))
  args[i].value = copy_to_mode_reg (args[i].mode, args[i].value);
+
+   /* Expand argument bounds if any.  */
+   if (args[i].bounds_value)
+ {
+   /* If bounds_value is a list then we pass bounds
+  for a structure.  TREE_VALUE of each node holds
+  base structure address and TREE_PURPOSE holds
+  corresponding offset of the pointer field in
+  structure.  */
+   if (TREE_CODE (args[i].bounds_value) == TREE_LIST)
+ {
+   tree node = args[i].bounds_value;
+   unsigned bnd_num = list_length (args[i].bounds_value);
+   rtx *bounds = XALLOCAVEC (rtx, bnd_num);
+   unsigned bnd_no = 0;
+   tree base_addr = TREE_VALUE (node);
+   rtx base = expand_normal (base_addr);
+
+   /* Expand all nodes in the list.  */
+   while (node)
+  

[Ada] Fix unexpected read of volatile scalar for Out parameter

2013-11-18 Thread Eric Botcazou
This fixes an old issue in the compiler, whereby it unexpectedly generates a 
read of a volatile variable with scalar type passed as Out parameter to a 
subprogram; the subtlety being that the side-effects of the parameter viewed 
as a name still need to be evaluated prior to the call.

Tested on x86_64-suse-linux, applied on the mainline.


2013-11-18  Eric Botcazou  

* gcc-interface/trans.c (Call_to_gnu): For an Out parameter passed by
copy and that don't need to be  copied in, only evaluate its address.


2013-11-18  Eric Botcazou  

* gnat.dg/volatile11.adb: New test.
* gnat.dg/volatile11_pkg.ad[sb]: New helper.


-- 
Eric BotcazouIndex: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 204913)
+++ gcc-interface/trans.c	(working copy)
@@ -4130,9 +4130,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	gnu_name
 	  = convert (TREE_TYPE (TYPE_FIELDS (TREE_TYPE (gnu_name))), gnu_name);
 
-  /* If we have not saved a GCC object for the formal, it means it is an
-	 Out parameter not passed by reference and that need not be copied in.
-	 Otherwise, first see if the parameter is passed by reference.  */
+  /* First see if the parameter is passed by reference.  */
   if (is_true_formal_parm && DECL_BY_REF_P (gnu_formal))
 	{
 	  if (Ekind (gnat_formal) != E_In_Parameter)
@@ -4178,6 +4176,9 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	  gnu_formal_type = TREE_TYPE (gnu_formal);
 	  gnu_actual = build_unary_op (ADDR_EXPR, gnu_formal_type, gnu_actual);
 	}
+
+  /* Then see if the parameter is an array passed to a foreign convention
+	 subprogram.  */
   else if (is_true_formal_parm && DECL_BY_COMPONENT_PTR_P (gnu_formal))
 	{
 	  gnu_formal_type = TREE_TYPE (gnu_formal);
@@ -4198,6 +4199,8 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	 but this is the most likely to work in all cases.  */
 	  gnu_actual = build_unary_op (ADDR_EXPR, gnu_formal_type, gnu_actual);
 	}
+
+  /* Then see if the parameter is passed by descriptor.  */
   else if (is_true_formal_parm && DECL_BY_DESCRIPTOR_P (gnu_formal))
 	{
 	  gnu_actual = convert (gnu_formal_type, gnu_actual);
@@ -4214,6 +4217,8 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	 (TREE_TYPE (TREE_TYPE (gnu_formal)),
 	  gnu_actual, gnat_actual));
 	}
+
+  /* Otherwise the parameter is passed by copy.  */
   else
 	{
 	  tree gnu_size;
@@ -4221,11 +4226,18 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 	  if (Ekind (gnat_formal) != E_In_Parameter)
 	gnu_name_list = tree_cons (NULL_TREE, gnu_name, gnu_name_list);
 
+	  /* If we didn't create a PARM_DECL for the formal, this means that
+	 it is an Out parameter not passed by reference and that need not
+	 be copied in.  In this case, the value of the actual need not be
+	 read.  However, we still need to make sure that its side-effects
+	 are evaluated before the call, so we evaluate its address.  */
 	  if (!is_true_formal_parm)
 	{
-	  /* Make sure side-effects are evaluated before the call.  */
 	  if (TREE_SIDE_EFFECTS (gnu_name))
-		append_to_statement_list (gnu_name, &gnu_stmt_list);
+		{
+		  tree addr = build_unary_op (ADDR_EXPR, NULL_TREE, gnu_name);
+		  append_to_statement_list (addr, &gnu_stmt_list);
+		}
 	  continue;
 	}
 
-- { dg-do run }
-- { dg-options "-O -gnatp" }

with Volatile11_Pkg; use Volatile11_Pkg;

procedure Volatile11 is

  Value : Integer := 1;
  Bit1 : Boolean := false;
  pragma Volatile (Bit1);
  Bit2 : Boolean := false;
  pragma Volatile (Bit2);
  Bit3 : Boolean := false;
  pragma Volatile (Bit3);
  Bit4 : Boolean := false;
  pragma Volatile (Bit4);
  Bit5 : Boolean := false;
  pragma Volatile (Bit5);
  Bit6 : Boolean := false;
  pragma Volatile (Bit6);
  Bit7 : Boolean := false;
  pragma Volatile (Bit7);
  Bit8 : Boolean := false;
  pragma Volatile (Bit8);

begin
  Bit_Test(Input   => Value,
   Output1 => Bit1,
   Output2 => Bit2,
   Output3 => Bit3,
   Output4 => Bit4,
   Output5 => Bit5,
   Output6 => Bit6,
   Output7 => Bit7,
   Output8 => F.all);

  -- Check that F is invoked before Bit_Test
  if B /= True then
raise Program_Error;
  end if;
end;
package body Volatile11_Pkg is

   procedure Bit_Test(Input : in Integer;
  Output1 : out Boolean; Output2 : out Boolean;
  Output3 : out Boolean; Output4 : out Boolean;
  Output5 : out Boolean; Output6 : out Boolean;
  Output7 : out Boolean; Output8 : out Boolean)  is
  begin
Output8 := B;
Output7 := Input = 7;
Output6 := Input = 6;
Output5 := Input = 5;
Output4 := Input = 4;
Output3 := Input = 3;
Output2 := Input = 2;
Output1 := Input = 1;
  end Bit_Test;

  func

Clean up LTO generation

2013-11-18 Thread Bernd Schmidt
I'm looking into using LTO to pass information between compilers for
different targets, for the OpenACC implementation. This area of the
compiler seems somewhat obfuscated by overengineering, and I'd like to
simplify it first to make it easier to work with.

lto_gimple_out and lto_finish_out aren't real passes, only their
write_summary methods are nonnull. The only thing we really do to emit
LTO is just to make two function calls, to lto_output and
produce_asm_for_decls, but this is wrapped in a lot of pass_manager
boilerplate to confuse the reader. Other downsides are bogus empty dump
files left behind, and plain dead code like

  ipa_read_optimization_summaries_1 (passes->all_lto_gen_passes);

The following patch simplifies this by getting rid of
all_lto_gen_passes. We could simplify a little further if we decided we
really don't need two different timevars for different steps of LTO
output.  Bootstrapped and tested on x86_64-linux, ok?


Bernd
	* cgraphunit.c (ipa_passes): Don't execute all_lto_gen_passes.
	* lto-streamer-out.c (lto_output, produce_asm_for_decls): No longer
	static.
	(pass_data_ipa_lto_gimple_out, pass_ipa_lto_gimple_out,
	make_pass_ipa_lto_gimple_out, pass_data_ipa_lto_finish_out,
	pass_ipa_lto_finish_out, make_pass_ipa_lto_finish_out): Remove.
	* lto-streamer.h (lto_output, produce_asm_for_decls): Declare.
	* pass-manager.h (GCC_PASS_LISTS, class pass_manager):
	Remove all_to_gen_passes.
	* passes.c (pass_manager::dump_passes): Remove its use.
	(pass_manager::register_pass): Likewise.
	(ipa_read_summaries, ipa_read_optimization_summaries): Likewise.
	(pass_manager::pass_manager): Don't initialize or use it.
	(write_lto): New static function.
	(ipa_write_summaries_1, ipa_write_optimization_summaries): Use it
	instead of using all_lto_gen_passes.
	* passes.def (all_to_gen_passes, pass_ipa_lto_gimple_out,
	pass_ipa_lto_finish_out): Delete.
	* tree-pass.h (make_pass_ipa_lto_gimple_out,
	make_pass_ipa_lto_finish_out): Don't declare.

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 8ab274b..e815be0 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2018,9 +2018,6 @@ ipa_passes (void)
   if (flag_generate_lto)
 targetm.asm_out.lto_start ();
 
-  execute_ipa_summary_passes ((struct ipa_opt_pass_d *)
-			  passes->all_lto_gen_passes);
-
   if (!in_lto_p)
 ipa_write_summaries ();
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 9a00230..5176e35 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1966,7 +1966,7 @@ copy_function (struct cgraph_node *node)
 
 /* Main entry point from the pass manager.  */
 
-static void
+void
 lto_output (void)
 {
   struct lto_out_decl_state *decl_state;
@@ -2016,53 +2016,6 @@ lto_output (void)
 #endif
 }
 
-namespace {
-
-const pass_data pass_data_ipa_lto_gimple_out =
-{
-  IPA_PASS, /* type */
-  "lto_gimple_out", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  true, /* has_gate */
-  false, /* has_execute */
-  TV_IPA_LTO_GIMPLE_OUT, /* tv_id */
-  0, /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
-
-class pass_ipa_lto_gimple_out : public ipa_opt_pass_d
-{
-public:
-  pass_ipa_lto_gimple_out (gcc::context *ctxt)
-: ipa_opt_pass_d (pass_data_ipa_lto_gimple_out, ctxt,
-		  NULL, /* generate_summary */
-		  lto_output, /* write_summary */
-		  NULL, /* read_summary */
-		  lto_output, /* write_optimization_summary */
-		  NULL, /* read_optimization_summary */
-		  NULL, /* stmt_fixup */
-		  0, /* function_transform_todo_flags_start */
-		  NULL, /* function_transform */
-		  NULL) /* variable_transform */
-  {}
-
-  /* opt_pass methods: */
-  bool gate () { return gate_lto_out (); }
-
-}; // class pass_ipa_lto_gimple_out
-
-} // anon namespace
-
-ipa_opt_pass_d *
-make_pass_ipa_lto_gimple_out (gcc::context *ctxt)
-{
-  return new pass_ipa_lto_gimple_out (ctxt);
-}
-
-
 /* Write each node in encoded by ENCODER to OB, as well as those reachable
from it and required for correct representation of its semantics.
Each node in ENCODER must be a global declaration or a type.  A node
@@ -2382,7 +2335,7 @@ produce_symtab (struct output_block *ob)
this file to be written in to a section that can then be read in to
recover these on other side.  */
 
-static void
+void
 produce_asm_for_decls (void)
 {
   struct lto_out_decl_state *out_state;
@@ -2486,50 +2439,3 @@ produce_asm_for_decls (void)
   lto_function_decl_states.release ();
   destroy_output_block (ob);
 }
-
-
-namespace {
-
-const pass_data pass_data_ipa_lto_finish_out =
-{
-  IPA_PASS, /* type */
-  "lto_decls_out", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  true, /* has_gate */
-  false, /* has_execute */
-  TV_IPA_LTO_DECL_OUT, /* tv_id */
-  0, /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_fl

Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx patterns in IVOPT

2013-11-18 Thread Bin.Cheng
Ping^2

Thanks,
bin

On Tue, Nov 12, 2013 at 3:08 PM, bin.cheng  wrote:
> Ping in this one.
> Hi Bernd, could you please help us on this one?

> Sorry for the inconvenience.
>
> Thanks,
> bin
>
>> -Original Message-
>> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
>> Sent: Monday, November 04, 2013 8:56 PM
>> To: Richard Biener
>> Cc: Bin Cheng; Bernd Schmidt; GCC Patches
>> Subject: Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx
>> patterns in IVOPT
>>
>> On Mon, Nov 4, 2013 at 7:38 PM, Richard Biener
>>  wrote:
>> > On Mon, Nov 4, 2013 at 4:31 AM, bin.cheng  wrote:
>> >> Hi,
>> >>
>> >> The IVOPT in GCC has a problem that it does not use cost of
>> >> auto-increment address expression in accounting, while it retreats to
>> >> cost of address expression if auto-increment addressing mode is
>> unavailable.
>> >> For example, on ARM target:
>> >> 1) the cost of "[reg]" (which is 6) is used for address expression
>> >> "[reg], #off";
>> >> 2) the cost of "[reg+off]" (which is 2) is used for address
>> >> expression "[reg, #off]!";
>> >>
>> >> This causes:
>> >> 1) cost of non-auto increment address expression is used for
>> >> auto-increment address expression;
>> >> 2) different address costs are used for pre/post increment address
>> >> expressions.
>> >> This patch fixes the problem by computing, caching and using the cost
>> >> of auto-increment address expressions.
>> >>
>> >> Bootstrap and test on x86/arm.  Is it OK?
>> >
>> > But don't you need to adjust
>> >
>> > static bool
>> > determine_use_iv_cost_address (struct ivopts_data *data,
>> >struct iv_use *use, struct iv_cand
>> > *cand) {
>> >   bitmap depends_on;
>> >   bool can_autoinc;
>> >   int inv_expr_id = -1;
>> >   comp_cost cost = get_computation_cost (data, use, cand, true,
>> &depends_on,
>> >  &can_autoinc, &inv_expr_id);
>> >
>> >   if (cand->ainc_use == use)
>> > {
>> >   if (can_autoinc)
>> > cost.cost -= cand->cost_step;
>> >
>> > this which seems to try to compensate for your issue?
>> That's where problem gets complicated depending on how backend defines
>> address cost.  For back ends define cost according to the true cost of
>> addressing mode approximately, the address cost of auto-increment
>> addressing mode doesn't take the saved stepping instruction into
>> consideration, so the code is necessary.
>> Moreover, according to gcc internal's description about
>> TARGET_ADDRESS_COST, RISC machines may define different address cost
>> for addressing modes which actually have equal execution on micro-
>> architecture level (like ARM for now).  The problems are:
>> 1) Though address costs are defined in this "discriminated" way, it's
> unlikely
>> to have the saved stepping instruction considered either.
>> The address cost of auto-increment address expression shouldn't go so far.
>> 2) We think the "discriminated" address cost model is established before
>> gimple pass and is outdated.  The true micro-architecture address cost (or
>> cost normalized with COSTS_N_INSNS) should be used in GCC nowadays.
>> The rtl passes like fwprop_addr which use address cost as heuristic
>> information should be refined... but that's totally another problem (am
>> investigating it).
>>
>> So overall, I think the stepping cost should to be subtracted here.
>>
>> >
>> > Or maybe I don't understand.
>> >
>> > CCing Bernd who implemented this IIRC.
>> Any suggestions appreciated.
>>
>> Thanks.
>> bin
>>
>> --
>> Best Regards.
>
>
>
>



-- 
Best Regards.


Re: [PATCH] Time profiler - phase 2

2013-11-18 Thread Jan Hubicka
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 5cb07b7..754f882 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,13 @@
> +2013-11-17  Martin Liska  
> + Jan Hubicka  
> +
> + * cgraphunit.c (node_cmp): New function.
> + (expand_all_functions): Function ordering added.
> + * common.opt: New profile based function reordering flag introduced.
> + * lto-partition.c: Support for time profile added.
> + * lto.c: Likewise.
> + * predict.c (handle_missing_profiles): Time profile handled in
> +   missing profiles.

OK,
thanks!  Implementing the function section naming scheme would be easy and it 
would
enable us to do the reordering even w/o LTO that would be quite cool. Lets hope 
it gets
resolved soon.

Honza


[Ada] Remove redundant define

2013-11-18 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2013-11-18  Eric Botcazou  

* gcc-interface/trans.c (TARGET_ABI_OPEN_VMS): Delete as redundant.


-- 
Eric BotcazouIndex: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 204943)
+++ gcc-interface/trans.c	(working copy)
@@ -66,12 +66,6 @@
instead.  */
 #define ALLOCA_THRESHOLD 1000
 
-/* Let code below know whether we are targeting VMS without need of
-   intrusive preprocessor directives.  */
-#ifndef TARGET_ABI_OPEN_VMS
-#define TARGET_ABI_OPEN_VMS 0
-#endif
-
 /* In configurations where blocks have no end_locus attached, just
sink assignments into a dummy global.  */
 #ifndef BLOCK_SOURCE_END_LOCATION


[PATCH, MPX, 2/X] Pointers Checker [13/25] Inline support

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to add instumented code support in inliner.  Main changes are:
  - Replace BUILT_IN_CHKP_ARG_BND calls in inlined code with actually passed 
bounds
  - Replace BUILT_IN_CHKP_BNDRET in caller with returned bounds
  - Add bounds copy for assignments generated by inliner

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* ipa-inline.c (early_inliner): Check edge has summary allocated.
* tree-inline.c: Include tree-chkp.h.
(declare_return_variable): Add arg holding
returned bounds slot.  Create and initialize returned bounds var.
(remap_gimple_stmt): Handle returned bounds.  Replace
BUILT_IN_CHKP_ARG_BND built-in call with actual arg bounds.
Return sequence of statements instead of a single statement.
(insert_init_stmt): Add declaration.
(remap_gimple_seq): Adjust to new remap_gimple_stmt signature.
(copy_bb): Adjust to changed return type of remap_gimple_stmt.
(initialize_inlined_parameters): Initialize bounds of params.
(expand_call_inline): Handle returned bounds.  Add bounds copy
for generated mem to mem assignments.
* tree-inline.h (copy_body_data): Add fields retbnd and
assign_stmts.
* cgraph.c: Include tree-chkp.h.
(cgraph_redirect_edge_call_stmt_to_callee): Support
returned bounds.
* value-prof.c: Include tree-chkp.h.
(gimple_ic): Support returned bounds.


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 2841055..549703d 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-inline.h"
 #include "cfgloop.h"
 #include "gimple-pretty-print.h"
+#include "tree-chkp.h"
 
 /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this.  
*/
 #include "tree-pass.h"
@@ -1380,6 +1381,28 @@ cgraph_redirect_edge_call_stmt_to_callee (struct 
cgraph_edge *e)
  e->speculative = false;
  cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt,
 new_stmt, false);
+ if (gimple_call_with_bounds_p (new_stmt)
+ && gimple_call_lhs (new_stmt)
+ && chkp_retbnd_call_by_val (gimple_call_lhs (e2->call_stmt)))
+   {
+ tree dresult = gimple_call_lhs (new_stmt);
+ tree iresult = gimple_call_lhs (e2->call_stmt);
+ gimple dbndret = chkp_retbnd_call_by_val (dresult);
+ gimple ibndret = chkp_retbnd_call_by_val (iresult);
+ struct cgraph_edge *iedge = cgraph_edge (e2->caller, ibndret);
+ struct cgraph_edge *dedge;
+
+ if (dbndret)
+   {
+ dedge = cgraph_create_edge (iedge->caller, iedge->callee,
+ dbndret, e->count,
+ e->frequency);
+ dedge->frequency = compute_call_stmt_bb_frequency
+   (dedge->caller->decl, gimple_bb (dedge->call_stmt));
+   }
+ iedge->frequency = compute_call_stmt_bb_frequency
+   (iedge->caller->decl, gimple_bb (iedge->call_stmt));
+   }
  e->frequency = compute_call_stmt_bb_frequency
   (e->caller->decl, gimple_bb (e->call_stmt));
  e2->frequency = compute_call_stmt_bb_frequency
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 85f8e5d..ecf7572 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -2267,11 +2267,15 @@ early_inliner (void)
 info that might be cleared out for newly discovered edges.  */
  for (edge = node->callees; edge; edge = edge->next_callee)
{
- struct inline_edge_summary *es = inline_edge_summary (edge);
- es->call_stmt_size
-   = estimate_num_insns (edge->call_stmt, &eni_size_weights);
- es->call_stmt_time
-   = estimate_num_insns (edge->call_stmt, &eni_time_weights);
+ /* We have no summary for new bound store calls yet.  */
+ if (inline_edge_summary_vec.length () > (unsigned)edge->uid)
+   {
+ struct inline_edge_summary *es = inline_edge_summary (edge);
+ es->call_stmt_size
+   = estimate_num_insns (edge->call_stmt, &eni_size_weights);
+ es->call_stmt_time
+   = estimate_num_insns (edge->call_stmt, &eni_time_weights);
+   }
  if (edge->callee->decl
  && !gimple_check_call_matching_types (
  edge->call_stmt, edge->callee->decl, false))
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 1e1285a..d2f2337 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "target.h"
 #include "cfgloop.h"
+#include "tree-chkp.h"
 
 #include "rtl.h"   /

[Ping, avr] Emit diagnostics if -f{pic,PIC,pie,PIE} or -shared is passed

2013-11-18 Thread Senthil Kumar Selvaraj
Ping!

Regards
Senthil

On Mon, Nov 04, 2013 at 06:45:19PM +0530, Senthil Kumar Selvaraj wrote:
> The AVR backend does not generate position independent code, yet it
> happily accepts -fpic, -fPIC, -fpie and -fPIE. The generated code
> doesn't change at all. Also, it accepts the -shared option to generate a
> shared library, without really doing anything with it.
> 
> This causes one of the regression tests 
> (gcc.dg/lto/pr54709 c_lto_pr54709_0.o-c_lto_pr54709_1.o link) to fail with 
> an 'undefined reference to main' error, when the test is trying to build 
> a shared object.
> 
> The attached patch generates a warning if one of the -f{pic,PIC,pie,PIE} 
> options is provided, and an error if -shared is provided (
> config/mep/mep.c and config/s390/tpf.h already do something very similar).
> 
> Regression tested with no new failures.Tests which exercise PIC now report as 
> unsupported.
> 
> If ok, could someone commit please? I don't have commit access.
> 
> Regards
> Senthil
> 
> gcc/ChangeLog
> 2013-11-04  Senthil Kumar Selvaraj  
> 
>   * config/avr/avr.c (avr_option_override): Warn if asked to generate
>   position independent code.
>   * config/avr/avr.h: Modify LINK_SPEC to reject -shared.
> 
> 
> diff --git gcc/config/avr/avr.c gcc/config/avr/avr.c
> index e7e1c2f..cf4b8ca 100644
> --- gcc/config/avr/avr.c
> +++ gcc/config/avr/avr.c
> @@ -310,6 +310,15 @@ avr_option_override (void)
>flag_omit_frame_pointer = 0;
>  }
>  
> +  if (flag_pic == 1)
> +warning (OPT_fpic, "-fpic is not supported");
> +  if (flag_pic == 2)
> +warning (OPT_fPIC, "-fPIC is not supported");
> +  if (flag_pie == 1)
> +warning (OPT_fpie, "-fpie is not supported");
> +  if (flag_pie == 2)
> +warning (OPT_fPIE, "-fPIE is not supported");
> +
>avr_current_device = &avr_mcu_types[avr_mcu_index];
>avr_current_arch = &avr_arch_types[avr_current_device->arch];
>  
> diff --git gcc/config/avr/avr.h gcc/config/avr/avr.h
> index f223a61..1eff5be 100644
> --- gcc/config/avr/avr.h
> +++ gcc/config/avr/avr.h
> @@ -522,7 +522,8 @@ extern const char *avr_device_to_sp8 (int argc, const 
> char **argv);
> mmcu=at90can64*|\
> mmcu=at90usb64*:--pmem-wrap-around=64k}}}\
>  %:device_to_ld(%{mmcu=*:%*})\
> -%:device_to_data_start(%{mmcu=*:%*})"
> +%:device_to_data_start(%{mmcu=*:%*})\
> +%{shared:%eshared is not supported}"
>  
>  #define LIB_SPEC \
>
> "%{!mmcu=at90s1*:%{!mmcu=attiny11:%{!mmcu=attiny12:%{!mmcu=attiny15:%{!mmcu=attiny28:
>  -lc }"


[PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to disable splitting when bounds transfer is required for 
splitted function.

Thanks,
Ilya
--
2013-11-13  Ilya Enkovich  

* ipa-split.c: Include tree-chkp.h.
(consider_split): Do not split when splitted
part needs bounds transfer.


diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index e55b3f5..1c3df61 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -101,6 +101,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "ipa-inline.h"
 #include "cfgloop.h"
+#include "tree-chkp.h"
 
 /* Per basic block info.  */
 
@@ -378,6 +379,7 @@ consider_split (struct split_point *current, bitmap 
non_ssa_vars,
   int incoming_freq = 0;
   tree retval;
   bool back_edge = false;
+  bitmap_iterator bi;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
 dump_split_point (dump_file, current);
@@ -486,6 +488,28 @@ consider_split (struct split_point *current, bitmap 
non_ssa_vars,
   if (!VOID_TYPE_P (TREE_TYPE (current_function_decl)))
 call_overhead += estimate_move_cost (TREE_TYPE (current_function_decl));
 
+  /* Currently bounds passing and return is not supported for
+ splitted functions.  */
+  EXECUTE_IF_SET_IN_BITMAP (current->ssa_names_to_pass, 0, i, bi)
+{
+  if (POINTER_BOUNDS_P (ssa_name (i)))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file,
+"  Refused: need to pass bounds\n");
+ return;
+   }
+}
+
+  if (chkp_function_instrumented_p (current_function_decl)
+  && chkp_type_has_pointer (TREE_TYPE (current_function_decl)))
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file,
+"  Refused: need to return bounds\n");
+  return;
+}
+
   if (current->split_size <= call_overhead)
 {
   if (dump_file && (dump_flags & TDF_DETAILS))


[PATCH, MPX, 2/X] Pointers Checker [15/25] IPA Propagation

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to disable propagation of bounded values.

Thanks,
Ilya
--
2013-11-13  Ilya Enkovich  

* ipa-prop.c: Include tree-chkp.h.
(ipa_compute_jump_functions_for_edge): Do not propagate bounded args.


diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index eb464e4..81e1237 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-streamer.h"
 #include "params.h"
 #include "ipa-utils.h"
+#include "tree-chkp.h"
 
 /* Intermediate information about a parameter that is only useful during the
run of ipa_analyze_node and is not kept afterwards.  */
@@ -1558,6 +1559,7 @@ ipa_compute_jump_functions_for_edge (struct 
param_analysis_info *parms_ainfo,
   struct ipa_node_params *info = IPA_NODE_REF (cs->caller);
   struct ipa_edge_args *args = IPA_EDGE_REF (cs);
   gimple call = cs->call_stmt;
+  tree fndecl = gimple_call_fndecl (call);
   int n, arg_num = gimple_call_num_args (call);
 
   if (arg_num == 0 || args->jump_functions)
@@ -1575,7 +1577,13 @@ ipa_compute_jump_functions_for_edge (struct 
param_analysis_info *parms_ainfo,
   tree arg = gimple_call_arg (call, n);
   tree param_type = ipa_get_callee_param_type (cs, n);
 
-  if (is_gimple_ip_invariant (arg))
+  /* No optimization for bounded types yet implemented.  */
+  if ((gimple_call_with_bounds_p (call)
+  || (fndecl && chkp_function_instrumented_p (fndecl)))
+ && ((param_type && chkp_type_has_pointer (param_type))
+ || (!param_type && chkp_type_has_pointer (TREE_TYPE (arg)
+   continue;
+  else if (is_gimple_ip_invariant (arg))
ipa_set_jf_constant (jfunc, arg, cs);
   else if (!is_gimple_reg_type (TREE_TYPE (arg))
   && TREE_CODE (arg) == PARM_DECL)


RE: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-11-18 Thread Bernd Edlinger
Hi,

On Fri, 15 Nov 2013 13:30:51, Richard Biener wrote:
>> That looks like always pretending it is a bit field.
>> But it is not a bit field, and bitregion_start=bitregion_end=0
>> means it is an ordinary value.
>
> I don't think it is supposed to mean that. It's supposed to mean
> "the access is unconstrained".
>

Ok, agreed, I did not regard that as a feature.
And apparently only the code path in expand_assigment
really has a problem with it.

So here my second attempt at fixing this.

Boot-strapped and regression-tested on x86_64-linux-gnu.

OK for trunk?


Thanks
Bernd.2013-11-18  Bernd Edlinger  

Fix C++0x memory model for unaligned fields in packed, aligned(4)
structures with -fno-strict-volatile-bitfields on STRICT_ALIGNMENT
targets like arm-none-eabi.
* expr.c (expand_assignment): Handle normal fields like bit regions.

testsuite:
2013-11-18  Bernd Edlinger  

* gcc.dg/pr56997-4.c: New testcase.



patch-unaligned-data.diff
Description: Binary data


[PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to disable tail recursion transformation when bounds are passed 
by call.  The reason is BUILT_IN_CHKP_ARG_BND which should always get default 
SSA_NAME of PARM_DECL as an argument.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* tree-tailcall.c: Include tree-chkp.h.
(suitable_for_tail_opt_p): Disable tail
recursion for instrumented functions with bounded args.


diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index 185bf16..59845950 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "common/common-target.h"
 #include "ipa-utils.h"
+#include "tree-chkp.h"
 
 /* The file implements the tail recursion elimination.  It is also used to
analyze the tail calls in general, passing the results to the rtl level
@@ -141,6 +142,20 @@ suitable_for_tail_opt_p (void)
   if (cfun->stdarg)
 return false;
 
+  /* Tail recursion elimination may cause arg_bnd builtins to be called
+ not for PARM_DECL which is not allowed now.  Avoid optimization
+ in such cases for now.  */
+  if (chkp_function_instrumented_p (current_function_decl))
+{
+  tree param;
+
+  for (param = DECL_ARGUMENTS (current_function_decl);
+  param;
+  param = DECL_CHAIN (param))
+   if (BOUNDED_P (param))
+ return false;
+}
+
   return true;
 }
 /* Returns false when the function is not suitable for tail call optimization


[PATCH, MPX, 2/X] Pointers Checker [17/25] IPA SRA

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch remove bounded params from candidates for reduction.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* tree-sra.c: Include tree-chkp.h.
(find_param_candidates): Bounded param is not
a candidate.


diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index d97af13..eec2b6b 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -102,6 +102,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-chkp.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -3667,7 +3668,9 @@ find_param_candidates (void)
 
   if (TREE_THIS_VOLATILE (parm)
  || TREE_ADDRESSABLE (parm)
- || (!is_gimple_reg_type (type) && is_va_list_type (type)))
+ || (!is_gimple_reg_type (type) && is_va_list_type (type))
+ || (chkp_function_instrumented_p (current_function_decl)
+ && chkp_type_has_pointer (type)))
continue;
 
   if (is_unused_scalar_param (parm))


Re: [1/10] Add tree_fits_shwi_p and tree_fits_uhwi_p

2013-11-18 Thread Richard Biener
On Sat, Nov 16, 2013 at 1:55 PM, Richard Sandiford
 wrote:
> Add tree_fits_shwi_p and tree_fits_uhwi_p.  The implementations are taken
> directly from host_integerp.

tree_ is a bit generic - you only ever return true for INTEGER_CSTs,
so please use int_fits_[su]hwi_p please (mimicing int_fits_type_p).

Ok with that change.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree.h (tree_fits_shwi_p, tree_fits_uhwi_p): Declare.
> * tree.c (tree_fits_shwi_p, tree_fits_uhwi_p): Define.
>
> Index: gcc/tree.h
> ===
> --- gcc/tree.h  2013-11-16 09:09:56.388037088 +
> +++ gcc/tree.h  2013-11-16 09:11:53.535874667 +
> @@ -3659,6 +3659,16 @@ extern int host_integerp (const_tree, in
>ATTRIBUTE_PURE /* host_integerp is pure only when checking is disabled.  */
>  #endif
>;
> +extern bool tree_fits_shwi_p (const_tree)
> +#ifndef ENABLE_TREE_CHECKING
> +  ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled. 
>  */
> +#endif
> +  ;
> +extern bool tree_fits_uhwi_p (const_tree)
> +#ifndef ENABLE_TREE_CHECKING
> +  ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled. 
>  */
> +#endif
> +  ;
>  extern HOST_WIDE_INT tree_low_cst (const_tree, int);
>  #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
>  extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
> Index: gcc/tree.c
> ===
> --- gcc/tree.c  2013-11-16 09:09:56.388037088 +
> +++ gcc/tree.c  2013-11-16 09:11:53.534874659 +
> @@ -6990,6 +6990,32 @@ host_integerp (const_tree t, int pos)
>   || (pos && TREE_INT_CST_HIGH (t) == 0)));
>  }
>
> +/* Return true if T is an INTEGER_CST whose numerical value (extended
> +   according to TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  */
> +
> +bool
> +tree_fits_shwi_p (const_tree t)
> +{
> +  return (t != NULL_TREE
> + && TREE_CODE (t) == INTEGER_CST
> + && ((TREE_INT_CST_HIGH (t) == 0
> +  && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
> + || (TREE_INT_CST_HIGH (t) == -1
> + && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
> + && !TYPE_UNSIGNED (TREE_TYPE (t);
> +}
> +
> +/* Return true if T is an INTEGER_CST whose numerical value (extended
> +   according to TYPE_UNSIGNED) fits in an unsigned HOST_WIDE_INT.  */
> +
> +bool
> +tree_fits_uhwi_p (const_tree t)
> +{
> +  return (t != NULL_TREE
> + && TREE_CODE (t) == INTEGER_CST
> + && TREE_INT_CST_HIGH (t) == 0);
> +}
> +
>  /* Return the HOST_WIDE_INT least significant bits of T if it is an
> INTEGER_CST and there is no overflow.  POS is nonzero if the result must
> be non-negative.  We must be able to satisfy the above conditions.  */


Re: [5/10] Add tree_to_shwi and tree_to_uhwi

2013-11-18 Thread Richard Biener
On Sat, Nov 16, 2013 at 2:03 PM, Richard Sandiford
 wrote:
> Add tree_to_shwi and tree_to_uhwi.  Initially tree_to_uhwi returns a
> HOST_WIDE_INT, so that it's a direct replacement for tree_low_cst.
> Patch 10 makes it return unsigned HOST_WIDE_INT instead.

Possibly same naming issue (though we already do have the weird
int_cst_value) - int_to_[us]hwi ()?

OTOH if somebody else is fine with using tree_ for the four functions
then call it a day.

Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree.h (tree_to_shwi, tree_to_uhwi): Declare, with inline 
> expansions.
> * tree.c (tree_to_shwi, tree_to_uhwi): New functions.
>
> Index: gcc/tree.c
> ===
> --- gcc/tree.c  2013-11-15 16:46:27.420395607 +
> +++ gcc/tree.c  2013-11-15 16:47:15.226216885 +
> @@ -7027,6 +7027,28 @@ tree_low_cst (const_tree t, int pos)
>return TREE_INT_CST_LOW (t);
>  }
>
> +/* T is an INTEGER_CST whose numerical value (extended according to
> +   TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  Return that
> +   HOST_WIDE_INT.  */
> +
> +HOST_WIDE_INT
> +tree_to_shwi (const_tree t)
> +{
> +  gcc_assert (tree_fits_shwi_p (t));
> +  return TREE_INT_CST_LOW (t);
> +}
> +
> +/* T is an INTEGER_CST whose numerical value (extended according to
> +   TYPE_UNSIGNED) fits in an unsigned HOST_WIDE_INT.  Return that
> +   HOST_WIDE_INT.  */
> +
> +HOST_WIDE_INT
> +tree_to_uhwi (const_tree t)
> +{
> +  gcc_assert (tree_fits_uhwi_p (t));
> +  return TREE_INT_CST_LOW (t);
> +}
> +
>  /* Return the most significant (sign) bit of T.  */
>
>  int
> Index: gcc/tree.h
> ===
> --- gcc/tree.h  2013-11-15 16:46:26.263399881 +
> +++ gcc/tree.h  2013-11-15 16:46:56.569287095 +
> @@ -3662,6 +3662,8 @@ extern bool tree_fits_uhwi_p (const_tree
>  #endif
>;
>  extern HOST_WIDE_INT tree_low_cst (const_tree, int);
> +extern HOST_WIDE_INT tree_to_shwi (const_tree);
> +extern HOST_WIDE_INT tree_to_uhwi (const_tree);
>  #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
>  extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
>  tree_low_cst (const_tree t, int pos)
> @@ -3669,6 +3671,20 @@ tree_low_cst (const_tree t, int pos)
>gcc_assert (host_integerp (t, pos));
>return TREE_INT_CST_LOW (t);
>  }
> +
> +extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
> +tree_to_shwi (const_tree t)
> +{
> +  gcc_assert (tree_fits_shwi_p (t));
> +  return TREE_INT_CST_LOW (t);
> +}
> +
> +extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
> +tree_to_uhwi (const_tree t)
> +{
> +  gcc_assert (tree_fits_uhwi_p (t));
> +  return TREE_INT_CST_LOW (t);
> +}
>  #endif
>  extern int tree_int_cst_sgn (const_tree);
>  extern int tree_int_cst_sign_bit (const_tree);


Re: [6/10] Mechanical replacement of tree_low_cst (..., 0)

2013-11-18 Thread Richard Biener
On Sat, Nov 16, 2013 at 2:05 PM, Richard Sandiford
 wrote:
> Like patch 2, but using sed to replace "tree_low_cst (x, 0)" with
> "tree_to_shwi (x)".

Ok with whatever the naming bikeshedding arrives at.

Richard.

> Thanks,
> Richard
>
>
> gcc/c-family/
> * c-common.c, c-format.c, c-omp.c, c-pretty-print.c: Replace
> tree_low_cst (..., 0) with tree_to_shwi throughout.
>
> gcc/c/
> * c-parser.c: Replace tree_low_cst (..., 0) with tree_to_shwi
> throughout.
>
> gcc/cp/
> * class.c, dump.c, error.c, init.c, method.c, parser.c, semantics.c:
> Replace tree_low_cst (..., 0) with tree_to_shwi throughout.
>
> gcc/go/
> * gofrontend/expressions.cc: Replace tree_low_cst (..., 0) with
> tree_to_shwi throughout.
>
> gcc/java/
> * class.c, expr.c: Replace tree_low_cst (..., 0) with tree_to_shwi
> throughout.
>
> gcc/objc/
> * objc-next-runtime-abi-02.c: Replace tree_low_cst (..., 0) with
> tree_to_shwi throughout.
>
> gcc/
> * builtins.c, cilk-common.c, config/aarch64/aarch64.c,
> config/alpha/alpha.c, config/arm/arm.c, config/c6x/predicates.md,
> config/i386/i386.c, config/ia64/predicates.md, config/s390/s390.c,
> coverage.c, dbxout.c, dwarf2out.c, except.c, explow.c, expr.c, expr.h,
> fold-const.c, gimple-fold.c, godump.c, ipa-prop.c, omp-low.c,
> predict.c, rtlanal.c, sdbout.c, stmt.c, stor-layout.c, targhooks.c,
> tree-cfg.c, tree-data-ref.c, tree-inline.c, tree-ssa-forwprop.c,
> tree-ssa-loop-prefetch.c, tree-ssa-phiopt.c, tree-ssa-sccvn.c,
> tree-ssa-strlen.c, tree-stdarg.c, tree-vect-data-refs.c,
> tree-vect-patterns.c, tree.c, tree.h, var-tracking.c, varasm.c:
> Replace tree_low_cst (..., 0) with tree_to_shwi throughout.
>


Re: [7/10] Mechanical replacement of tree_low_cst (..., 1)

2013-11-18 Thread Richard Biener
On Sat, Nov 16, 2013 at 2:06 PM, Richard Sandiford
 wrote:
> Like the previous patch, but for tree_low_cst (x, 1) -> tree_to_uhwi_p (x).

Same ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/ada/
> * gcc-interface/decl.c, gcc-interface/utils.c, gcc-interface/utils2.c:
> Replace tree_low_cst (..., 1) with tree_to_uhwi throughout.
>
> gcc/c-family/
> * c-common.c, c-cppbuiltin.c: Replace tree_low_cst (..., 1) with
> tree_to_uhwi throughout.
>
> gcc/c/
> * c-decl.c, c-typeck.c: Replace tree_low_cst (..., 1) with
> tree_to_uhwi throughout.
>
> gcc/cp/
> * call.c, class.c, decl.c, error.c: Replace tree_low_cst (..., 1) with
> tree_to_uhwi throughout.
>
> gcc/objc/
> * objc-encoding.c: Replace tree_low_cst (..., 1) with tree_to_uhwi
> throughout.
>
> gcc/
> * alias.c, asan.c, builtins.c, cfgexpand.c, cgraph.c,
> config/aarch64/aarch64.c, config/alpha/predicates.md,
> config/arm/arm.c, config/darwin.c, config/epiphany/epiphany.c,
> config/i386/i386.c, config/iq2000/iq2000.c, config/m32c/m32c-pragma.c,
> config/mep/mep-pragma.c, config/mips/mips.c,
> config/picochip/picochip.c, config/rs6000/rs6000.c, cppbuiltin.c,
> dbxout.c, dwarf2out.c, emit-rtl.c, except.c, expr.c, fold-const.c,
> function.c, gimple-fold.c, godump.c, ipa-cp.c, ipa-prop.c, omp-low.c,
> predict.c, sdbout.c, stor-layout.c, trans-mem.c, tree-object-size.c,
> tree-sra.c, tree-ssa-ccp.c, tree-ssa-forwprop.c,
> tree-ssa-loop-ivcanon.c, tree-ssa-loop-ivopts.c, 
> tree-ssa-loop-niter.c,
> tree-ssa-loop-prefetch.c, tree-ssa-strlen.c, tree-stdarg.c,
> tree-switch-conversion.c, tree-vect-generic.c, tree-vect-loop.c,
> tree-vect-patterns.c, tree-vrp.c, tree.c, tsan.c, ubsan.c, varasm.c:
> Replace tree_low_cst (..., 1) with tree_to_uhwi throughout.
>


Re: [9/10] Remove host_integerp and tree_low_cst

2013-11-18 Thread Richard Biener
On Sat, Nov 16, 2013 at 2:12 PM, Richard Sandiford
 wrote:
> Remove the old functions, which are now unused.

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree.h (host_integerp, tree_low_cst): Delete.
> * tree.c (host_integerp, tree_low_cst): Delete.
>
> Index: gcc/tree.h
> ===
> --- gcc/tree.h  2013-11-16 09:35:59.381239766 +
> +++ gcc/tree.h  2013-11-16 10:14:00.618868694 +
> @@ -3654,11 +3654,6 @@ extern int attribute_list_contained (con
>  extern int tree_int_cst_equal (const_tree, const_tree);
>  extern int tree_int_cst_lt (const_tree, const_tree);
>  extern int tree_int_cst_compare (const_tree, const_tree);
> -extern int host_integerp (const_tree, int)
> -#ifndef ENABLE_TREE_CHECKING
> -  ATTRIBUTE_PURE /* host_integerp is pure only when checking is disabled.  */
> -#endif
> -  ;
>  extern bool tree_fits_shwi_p (const_tree)
>  #ifndef ENABLE_TREE_CHECKING
>ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled. 
>  */
> @@ -3669,18 +3664,10 @@ extern bool tree_fits_uhwi_p (const_tree
>ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled. 
>  */
>  #endif
>;
> -extern HOST_WIDE_INT tree_low_cst (const_tree, int);
>  extern HOST_WIDE_INT tree_to_shwi (const_tree);
>  extern HOST_WIDE_INT tree_to_uhwi (const_tree);
>  #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
>  extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
> -tree_low_cst (const_tree t, int pos)
> -{
> -  gcc_assert (host_integerp (t, pos));
> -  return TREE_INT_CST_LOW (t);
> -}
> -
> -extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
>  tree_to_shwi (const_tree t)
>  {
>gcc_assert (tree_fits_shwi_p (t));
> Index: gcc/tree.c
> ===
> --- gcc/tree.c  2013-11-16 09:59:37.205620348 +
> +++ gcc/tree.c  2013-11-16 10:14:00.604868554 +
> @@ -6970,26 +6970,6 @@ tree_int_cst_compare (const_tree t1, con
>  return 0;
>  }
>
> -/* Return 1 if T is an INTEGER_CST that can be manipulated efficiently on
> -   the host.  If POS is zero, the value can be represented in a single
> -   HOST_WIDE_INT.  If POS is nonzero, the value must be non-negative and can
> -   be represented in a single unsigned HOST_WIDE_INT.  */
> -
> -int
> -host_integerp (const_tree t, int pos)
> -{
> -  if (t == NULL_TREE)
> -return 0;
> -
> -  return (TREE_CODE (t) == INTEGER_CST
> - && ((TREE_INT_CST_HIGH (t) == 0
> -  && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
> - || (! pos && TREE_INT_CST_HIGH (t) == -1
> - && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
> - && !TYPE_UNSIGNED (TREE_TYPE (t)))
> - || (pos && TREE_INT_CST_HIGH (t) == 0)));
> -}
> -
>  /* Return true if T is an INTEGER_CST whose numerical value (extended
> according to TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  */
>
> @@ -7016,17 +6996,6 @@ tree_fits_uhwi_p (const_tree t)
>   && TREE_INT_CST_HIGH (t) == 0);
>  }
>
> -/* Return the HOST_WIDE_INT least significant bits of T if it is an
> -   INTEGER_CST and there is no overflow.  POS is nonzero if the result must
> -   be non-negative.  We must be able to satisfy the above conditions.  */
> -
> -HOST_WIDE_INT
> -tree_low_cst (const_tree t, int pos)
> -{
> -  gcc_assert (host_integerp (t, pos));
> -  return TREE_INT_CST_LOW (t);
> -}
> -
>  /* T is an INTEGER_CST whose numerical value (extended according to
> TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  Return that
> HOST_WIDE_INT.  */


[PATCH, MPX, 2/X] Pointers Checker [18/25] CCP (Stack store/restore)

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to support BUILT_IN_CHKP_BNDRET and BUILT_IN_CHKP_BIND_BOUNDS 
in BUILT_IN_STACK_SAVE result uses.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* tree-ssa-ccp.c (insert_clobber_before_stack_restore): Handle
BUILT_IN_CHKP_BNDRET and BUILT_IN_CHKP_BIND_BOUNDS calls.

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 50006ab..cfba927 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -1774,7 +1774,7 @@ insert_clobber_before_stack_restore (tree saved_val, tree 
var,
 gimple_htab *visited)
 {
   gimple stmt, clobber_stmt;
-  tree clobber;
+  tree clobber, fndecl;
   imm_use_iterator iter;
   gimple_stmt_iterator i;
   gimple *slot;
@@ -1806,6 +1806,13 @@ insert_clobber_before_stack_restore (tree saved_val, 
tree var,
 else if (gimple_assign_ssa_name_copy_p (stmt))
   insert_clobber_before_stack_restore (gimple_assign_lhs (stmt), var,
   visited);
+else if (gimple_call_builtin_p (stmt, BUILT_IN_CHKP_BIND_BOUNDS))
+  insert_clobber_before_stack_restore (gimple_call_lhs (stmt), var,
+  visited);
+else if (gimple_code (stmt) == GIMPLE_CALL
+&& (fndecl = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET))
+&& gimple_call_fndecl (stmt) == fndecl)
+  continue;
 else
   gcc_assert (is_gimple_debug (stmt));
 }


Re: [8/10] Mop up remaining tree_low_cst calls

2013-11-18 Thread Richard Biener
On Sat, Nov 16, 2013 at 2:10 PM, Richard Sandiford
 wrote:
> Handle tree_low_cst references that weren't caught by the sed.

Same ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/ada/
> * gcc-interface/cuintp.c (UI_From_gnu): Use tree_to_shwi rather than
> tree_low_cst.
>
> gcc/c-family/
> * c-common.c (fold_offsetof_1): Use tree_to_uhwi rather than
> tree_low_cst.
> (complete_array_type): Update comment to refer to tree_to_[su]hwi
> rather than tree_low_cst.
>
> gcc/c/
> * c-decl.c (grokdeclarator): Update comment to refer to
> tree_to_[su]hwi rather than tree_low_cst.
>
> gcc/cp/
> * decl.c (reshape_init_array_1): Use tree_to_uhwi rather than
> tree_low_cst.
> (grokdeclarator): Update comment to refer to tree_to_[su]hwi rather
> than tree_low_cst.
>
> gcc/
> * expr.h: Update comments to refer to tree_to_[su]hwi rather
> than tree_low_cst.
> * fold-const.c (fold_binary_loc): Likewise.
> * expr.c (store_constructor): Use tree_to_uhwi rather than
> tree_low_cst.
> * ipa-utils.h (possible_polymorphic_call_target_p): Likewise.
> * stmt.c (emit_case_dispatch_table): Likewise.
> * tree-switch-conversion.c (emit_case_bit_tests): Likewise.
>
> Index: gcc/ada/gcc-interface/cuintp.c
> ===
> --- gcc/ada/gcc-interface/cuintp.c  2013-11-16 13:08:22.531824320 +
> +++ gcc/ada/gcc-interface/cuintp.c  2013-11-16 13:08:24.254837390 +
> @@ -176,9 +176,9 @@ UI_From_gnu (tree Input)
>
>for (i = Max_For_Dint - 1; i >= 0; i--)
>  {
> -  v[i] = tree_low_cst (fold_build1 (ABS_EXPR, gnu_type,
> +  v[i] = tree_to_shwi (fold_build1 (ABS_EXPR, gnu_type,
> fold_build2 (TRUNC_MOD_EXPR, gnu_type,
> -gnu_temp, gnu_base)), 0);
> +gnu_temp, gnu_base)));
>gnu_temp = fold_build2 (TRUNC_DIV_EXPR, gnu_type, gnu_temp, gnu_base);
>  }
>
> Index: gcc/c-family/c-common.c
> ===
> --- gcc/c-family/c-common.c 2013-11-16 13:08:22.531824320 +
> +++ gcc/c-family/c-common.c 2013-11-16 13:08:46.45771 +
> @@ -9721,8 +9721,7 @@ fold_offsetof_1 (tree expr)
>   return error_mark_node;
> }
>off = size_binop_loc (input_location, PLUS_EXPR, DECL_FIELD_OFFSET (t),
> -   size_int (tree_low_cst (DECL_FIELD_BIT_OFFSET (t),
> -   1)
> +   size_int (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (t))
>   / BITS_PER_UNIT));
>break;
>
> @@ -10091,7 +10090,7 @@ complete_array_type (tree *ptype, tree i
>  {
>error ("size of array is too large");
>/* If we proceed with the array type as it is, we'll eventually
> -crash in tree_low_cst().  */
> +crash in tree_to_[su]hwi().  */
>type = error_mark_node;
>  }
>
> Index: gcc/c/c-decl.c
> ===
> --- gcc/c/c-decl.c  2013-11-16 13:08:22.531824320 +
> +++ gcc/c/c-decl.c  2013-11-16 13:08:24.258837421 +
> @@ -5912,7 +5912,7 @@ grokdeclarator (const struct c_declarato
>else
> error_at (loc, "size of unnamed array is too large");
>/* If we proceed with the array type as it is, we'll eventually
> -crash in tree_low_cst().  */
> +crash in tree_to_[su]hwi().  */
>type = error_mark_node;
>  }
>
> Index: gcc/cp/decl.c
> ===
> --- gcc/cp/decl.c   2013-11-16 13:08:22.531824320 +
> +++ gcc/cp/decl.c   2013-11-16 13:09:31.845353189 +
> @@ -5095,8 +5095,7 @@ reshape_init_array_1 (tree elt_type, tre
> max_index_cst = tree_to_uhwi (max_index);
>/* sizetype is sign extended, not zero extended.  */
>else
> -   max_index_cst = tree_low_cst (fold_convert (size_type_node, 
> max_index),
> - 1);
> +   max_index_cst = tree_to_uhwi (fold_convert (size_type_node, 
> max_index));
>  }
>
>/* Loop until there are no more initializers.  */
> @@ -10031,7 +10030,7 @@ grokdeclarator (const cp_declarator *dec
>  {
>error ("size of array %qs is too large", name);
>/* If we proceed with the array type as it is, we'll eventually
> -crash in tree_low_cst().  */
> +crash in tree_to_[su]hwi().  */
>type = error_mark_node;
>  }
>
> Index: gcc/expr.h
> ===
> --- gcc/expr.h  2013-11-16 13:08:22.531824320 +
> +++ gcc/expr.h  2013-11-16 13:08:24.263837459 +
> @@ -26,8 +26,8 @@ #define GC

Re: [10/10] Make tree_to_uhwi return unsigned

2013-11-18 Thread Richard Biener
On Sat, Nov 16, 2013 at 2:25 PM, Richard Sandiford
 wrote:
> This is probably the only non-obvious part of the series.  I went through
> all callers to tree_to_uhwi to see whether they were used in a context
> where signedness mattered.  If so, I tried to adjust the casting to match.
>
> This mostly meant removing casts to unsigned types.  There are a couple
> of cases where I added casts to HOST_WIDE_INT though, to mimic the old
> tree_low_cst behaviour:
>
> - In cfgexpand.c and trans-mem.c, where we're comparing the value
>   with an int PARAM_VALUE.  The test isn't watertight since any
>   unsigned constant > HOST_WIDE_INT_MAX is going to be accepted.
>   That's a preexisting problem though and it can be fixed more
>   easily with wi:: routines.  Until then this preserves the current
>   behaviour.
>
> - In the AArch32/64 and powerpc ABI handling.  Here too "count"
>   is an int and is probably not safe for large values anyway; e.g.:
>
> count *= (1 + tree_to_uhwi (TYPE_MAX_VALUE (index))
>   - tree_to_uhwi (TYPE_MIN_VALUE (index)));
>
>   is done without overflow checking.  This too is easier to fix
>   with wi::, so I've just kept it as a signed comparison for now.

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/c-family/
> * c-common.c (convert_vector_to_pointer_for_subscript): Remove
> cast to unsigned type.
>
> gcc/
> * tree.h (tree_to_uhwi): Return an unsigned HOST_WIDE_INT.
> * tree.c (tree_to_uhwi): Return an unsigned HOST_WIDE_INT.
> (tree_ctz): Remove cast to unsigned type.
> * builtins.c (fold_builtin_memory_op): Likewise.
> * dwarf2out.c (descr_info_loc): Likewise.
> * godump.c (go_output_typedef): Likewise.
> * omp-low.c (expand_omp_simd): Likewise.
> * stor-layout.c (excess_unit_span): Likewise.
> * tree-object-size.c (addr_object_size): Likewise.
> * tree-sra.c (analyze_all_variable_accesses): Likewise.
> * tree-ssa-forwprop.c (simplify_builtin_call): Likewise.
> (simplify_rotate): Likewise.
> * tree-ssa-strlen.c (adjust_last_stmt, handle_builtin_memcpy)
> (handle_pointer_plus): Likewise.
> * tree-switch-conversion.c (check_range): Likewise.
> * tree-vect-patterns.c (vect_recog_rotate_pattern): Likewise.
> * tsan.c (instrument_builtin_call): Likewise.
> * cfgexpand.c (defer_stack_allocation): Add cast to HOST_WIDE_INT.
> * trans-mem.c (tm_log_add): Likewise.
> * config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Likewise.
> * config/arm/arm.c (aapcs_vfp_sub_candidate): Likewise.
> * config/rs6000/rs6000.c (rs6000_aggregate_candidate): Likewise.
> * config/mips/mips.c (r10k_safe_mem_expr_p): Make offset unsigned.
>
> Index: gcc/c-family/c-common.c
> ===
> --- gcc/c-family/c-common.c 2013-11-16 10:13:53.825800713 +
> +++ gcc/c-family/c-common.c 2013-11-16 10:14:40.373263297 +
> @@ -11702,8 +11702,7 @@ convert_vector_to_pointer_for_subscript
>
>if (TREE_CODE (index) == INTEGER_CST)
>  if (!tree_fits_uhwi_p (index)
> -|| ((unsigned HOST_WIDE_INT) tree_to_uhwi (index)
> -   >= TYPE_VECTOR_SUBPARTS (type)))
> +|| tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type))
>warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
>
>c_common_mark_addressable_vec (*vecp);
> Index: gcc/tree.h
> ===
> --- gcc/tree.h  2013-11-16 10:14:00.618868694 +
> +++ gcc/tree.h  2013-11-16 10:14:40.488264431 +
> @@ -3665,7 +3665,7 @@ extern bool tree_fits_uhwi_p (const_tree
>  #endif
>;
>  extern HOST_WIDE_INT tree_to_shwi (const_tree);
> -extern HOST_WIDE_INT tree_to_uhwi (const_tree);
> +extern unsigned HOST_WIDE_INT tree_to_uhwi (const_tree);
>  #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
>  extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
>  tree_to_shwi (const_tree t)
> @@ -3674,7 +3674,7 @@ tree_to_shwi (const_tree t)
>return TREE_INT_CST_LOW (t);
>  }
>
> -extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
> +extern inline __attribute__ ((__gnu_inline__)) unsigned HOST_WIDE_INT
>  tree_to_uhwi (const_tree t)
>  {
>gcc_assert (tree_fits_uhwi_p (t));
> Index: gcc/tree.c
> ===
> --- gcc/tree.c  2013-11-16 10:14:00.604868554 +
> +++ gcc/tree.c  2013-11-16 10:14:40.488264431 +
> @@ -2211,8 +2211,7 @@ tree_ctz (const_tree expr)
>  case LSHIFT_EXPR:
>ret1 = tree_ctz (TREE_OPERAND (expr, 0));
>if (tree_fits_uhwi_p (TREE_OPERAND (expr, 1))
> - && ((unsigned HOST_WIDE_INT) tree_to_uhwi (TREE_OPERAND (expr, 1))
> - < (unsigned HOST_WIDE_INT) prec))
> + && (tree_to_uhwi (TREE_OPERAND (expr, 1)) <

[PATCH, MPX, 2/X] Pointers Checker [19/25] Strchr

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to replace BUILT_IN_CHKP_BNDRET with assignment when optimizing 
strchr.

Thanks,
Ilya
--
2013-11-13  Ilya Enkovich  

* tree-ssa-strlen.c: Include tree-chkp.h.
(handle_builtin_strchr): Remove retbnd call
when strchr call is removed.


diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index cfd7b00..711f5d7 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "params.h"
 #include "expr.h"
+#include "tree-chkp.h"
 
 /* A vector indexed by SSA_NAME_VERSION.  0 means unknown, positive value
is an index into strinfo vector, negative value stands for
@@ -961,6 +962,8 @@ handle_builtin_strchr (gimple_stmt_iterator *gsi)
   tree src;
   gimple stmt = gsi_stmt (*gsi);
   tree lhs = gimple_call_lhs (stmt);
+  gimple retbnd_stmt = NULL;
+  tree retbnd = NULL;
 
   if (lhs == NULL_TREE)
 return;
@@ -1009,6 +1012,14 @@ handle_builtin_strchr (gimple_stmt_iterator *gsi)
  TREE_TYPE (rhs)))
rhs = fold_convert_loc (loc, TREE_TYPE (lhs), rhs);
}
+
+ /* Remember passed and returned bounds if any.  */
+ if (gimple_call_with_bounds_p (stmt))
+   {
+ retbnd = chkp_get_call_arg_bounds (gimple_call_arg (stmt, 0));
+ retbnd_stmt = chkp_retbnd_call_by_val (lhs);
+   }
+
  if (!update_call_from_tree (gsi, rhs))
gimplify_and_update_call_from_tree (gsi, rhs);
  stmt = gsi_stmt (*gsi);
@@ -1018,6 +1029,18 @@ handle_builtin_strchr (gimple_stmt_iterator *gsi)
  fprintf (dump_file, "into: ");
  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
}
+
+ /* Replace retbnd call with assignment.  */
+ if (retbnd_stmt)
+   {
+ gimple_stmt_iterator ret_gsi = gsi_for_stmt (retbnd_stmt);
+
+ if (!update_call_from_tree (&ret_gsi, retbnd))
+   gimplify_and_update_call_from_tree (&ret_gsi, retbnd);
+  retbnd_stmt = gsi_stmt (ret_gsi);
+  update_stmt (retbnd_stmt);
+}
+
  if (si != NULL
  && si->endptr == NULL_TREE
  && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))


Re: [1/10] Add tree_fits_shwi_p and tree_fits_uhwi_p

2013-11-18 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 11:50:45AM +0100, Richard Biener wrote:
> On Sat, Nov 16, 2013 at 1:55 PM, Richard Sandiford
>  wrote:
> > Add tree_fits_shwi_p and tree_fits_uhwi_p.  The implementations are taken
> > directly from host_integerp.
> 
> tree_ is a bit generic - you only ever return true for INTEGER_CSTs,
> so please use int_fits_[su]hwi_p please (mimicing int_fits_type_p).
> 
> Ok with that change.

I actually think tree_fits is better, int_fits* looks too ambiguous to me,
what kind of integer (tree, int, HWI, ...) you actually mean.
It also matches better say the preexisting tree_to_double_int.

Jakub


Re: [5/10] Add tree_to_shwi and tree_to_uhwi

2013-11-18 Thread Richard Sandiford
Richard Biener  writes:
> On Sat, Nov 16, 2013 at 2:03 PM, Richard Sandiford
>  wrote:
>> Add tree_to_shwi and tree_to_uhwi.  Initially tree_to_uhwi returns a
>> HOST_WIDE_INT, so that it's a direct replacement for tree_low_cst.
>> Patch 10 makes it return unsigned HOST_WIDE_INT instead.
>
> Possibly same naming issue (though we already do have the weird
> int_cst_value) - int_to_[us]hwi ()?
>
> OTOH if somebody else is fine with using tree_ for the four functions
> then call it a day.

:-)  I'm OK either way.  It'd certainly be no problem to change the names.

Thanks,
Richard



Re: [1/10] Add tree_fits_shwi_p and tree_fits_uhwi_p

2013-11-18 Thread Eric Botcazou
> I actually think tree_fits is better, int_fits* looks too ambiguous to me,
> what kind of integer (tree, int, HWI, ...) you actually mean.

Seconded, all the more so that int_fits_type_p requires an INTEGER_CST as 
argument, unlike tree_fits_[su]hwi_p which accept any tree as argument.

-- 
Eric Botcazou


Re: Re: [Patch] Fix gcc.dg/20050922-*.c

2013-11-18 Thread Vidya Praveen

Mike,

On 25/10/13 00:37, Mike Stump wrote:

On Oct 24, 2013, at 2:26 AM, Vidya Praveen  wrote:

On Mon, Oct 21, 2013 at 06:40:28PM +0100, Mike Stump wrote:

On Oct 21, 2013, at 3:28 AM, Vidya Praveen  wrote:

Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c includes stdlib.h. This can
be a issue especially since they define uint32_t.



OK for 4.7, 4.8?


It fails on arm-none-eabi.


Ok, let it bake on trunk and then you can back port it if no one screams.



I think it has baked long enough. Could this be approved for 4.7 and 4.8 now?

VP.



Re: [0/10] Replace host_integerp and tree_low_cst

2013-11-18 Thread Richard Biener
On Mon, Nov 18, 2013 at 10:08 AM, Richard Sandiford
 wrote:
> Jeff Law  writes:
>> On 11/16/13 05:53, Richard Sandiford wrote:
>>> After the patch that went in yesterday, all calls to host_integerp and
>>> tree_low_cst pass a constant "pos" argument.  This series replaces each
>>> function with two separate ones:
>> [ ... ]
>> So I've almost entirely ignored the whole wide-int conversion discussion
>> and I suspect I'm not entirely alone.
>>
>> Can you briefly summarize what's y'all are trying to accomplish with the
>> wide-int changes?
>
> At the moment, we can only handle tree and rtl integer constants that
> fit in 2 HOST_WIDE_INTs.  The idea is to remove that limit.  E.g. things
> like OImode (used in a few ports) will become a first-class citizen,
> with all OImode values being representable.
>
> Besides that headline reason, there are various side benefits.  E.g.:
>
> - All INTEGER_CSTs can be viewed either in their TYPE_PRECISION or in
>   "infinite" precision, which isn't possible for 128-bit constants today.
>   (I.e. there's no way to distinguish signed and unsigned 128-bit constants
>   in a double_int.)
>
> - Wider-than-2-HWI intermediate results can be represented as a single
>   integer.  I'm told this is useful for VRP.  (wide-int is mostly Kenny
>   and Mike's work, I've just been butting in recently.)
>
> - rtl-level constant folding can use the same code to handle all
>   combinations of CONST_INT and CONST_DOUBLE (and CONST_WIDE_INT,
>   on converted ports).  At the moment we handle CONST_INT cases
>   specially, and don't try as hard with CONST_DOUBLEs.
>
> Implementation-wise, it tries to make it so that the common single-HWI
> cases are still fast.

Sadly CONST_WIDE_INTs don't get a mode:

rtx
immed_wide_int_const (const wide_int &v, enum machine_mode mode)
{
...
/* It is so tempting to just put the mode in here.  Must control
   myself ... */
PUT_MODE (value, VOIDmode);
CWI_PUT_NUM_ELEM (value, len);

so much for an incentive to get more targets converted...
(only after all targets are converted we possibly can merge CONST_INT
and CONST_WIDE_INT).

Richard.

> Thanks,
> Richard


Re: Clean up LTO generation

2013-11-18 Thread Richard Biener
On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt  wrote:
> I'm looking into using LTO to pass information between compilers for
> different targets, for the OpenACC implementation. This area of the
> compiler seems somewhat obfuscated by overengineering, and I'd like to
> simplify it first to make it easier to work with.
>
> lto_gimple_out and lto_finish_out aren't real passes, only their
> write_summary methods are nonnull. The only thing we really do to emit
> LTO is just to make two function calls, to lto_output and
> produce_asm_for_decls, but this is wrapped in a lot of pass_manager
> boilerplate to confuse the reader. Other downsides are bogus empty dump
> files left behind, and plain dead code like
>
>   ipa_read_optimization_summaries_1 (passes->all_lto_gen_passes);
>
> The following patch simplifies this by getting rid of
> all_lto_gen_passes. We could simplify a little further if we decided we
> really don't need two different timevars for different steps of LTO
> output.  Bootstrapped and tested on x86_64-linux, ok?

I'm fine with this - Honza, any objections?

Thanks,
Richard.

>
> Bernd


Re: [RFA][PATCH]Fix 59019

2013-11-18 Thread Steven Bosscher
On Mon, Nov 18, 2013 at 5:36 AM, Jeff Law  wrote:
> On 11/17/13 04:28, Steven Bosscher wrote:
>>
>>
>> TRAP_CONDITION (PATTERN (i3)) == const1_rtx
>>
>> But shouldn't the check be on const_true_rtx? Or does combine put a
>> const1_rtx there?
>
> I took const1_rtx from control_flow_insn_p.  That's ultimately what we need
> to be consistent with.

Hmm, I agree but look at the test in ifcvt.c...

>> Bah... Emitting the barrier is necessary here because
>> fixup_reorder_chain doesn't handle cases where a basic block is a dead
>> end. That is actually a bug in fixup_reorder_chain: Other passes could
>> create dead ends in the CFG in cfglayout mode and not emit a barrier
>> into BB_FOOTER, and fixup_reorder_chain wouldn't be able to handle
>> that (resulting in verify_flow_info failure).
>
> Umm, no.  Failure to emit the barrier will result in a checking failure.
> Been there, done that.

Read my comment again: "... is necessary ..." and "...
fixup_reorder_chain should emit a BARRIER ...".

So yes, you need to emit that barrier with GCC as-is. But in a
perfect, ideal world where everyone is happy all the time, the sun
always shines, lunch is always free, and everything smells like red
roses at dawn, you shouldn't have to emit a BARRIER when the compiler
is in cfglayout mode.

Instead, fixup_reorder_chain *should* do it for you, but obviously
doesn't. There are several places where GCC code emits barriers while
in cfglayout mode, and that makes no sense because barriers are
meaningless in cfglayout mode :-)


> To trigger use the test in 59019 with an itanic cross compiler and comment
> out these two lines from gimple-ssa-isolate-paths.c:
>
>
>   TREE_THIS_VOLATILE (op) = 1;
>   TREE_SIDE_EFFECTS (op) = 1;

OK, thanks. Let's see if I can tackle this one.

Ciao!
Steven


[PATCH, MPX, 2/X] Pointers Checker [20/25] Debug info

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to avoid ICE in debug info generator for instrumented code.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE.
* dwarf2out.c (gen_subprogram_die): Ignore bound args.
(gen_type_die_with_usage): Skip pointer bounds.
(dwarf2out_global_decl): Likewise.


diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index 84f765e..2e97b9f 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -2365,6 +2365,10 @@ dbxout_type (tree type, int full)
   dbxout_type (TREE_TYPE (type), 0);
   break;
 
+case POINTER_BOUNDS_TYPE:
+  /* No debug info for pointer bounds type supported yet.  */
+  break;
+
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 3822a65..f7e97bf 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -18284,7 +18284,7 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
gen_formal_parameter_pack_die (generic_decl_parm,
   parm, subr_die,
   &parm);
- else if (parm)
+ else if (parm && !POINTER_BOUNDS_P (parm))
{
  dw_die_ref parm_die = gen_decl_die (parm, NULL, subr_die);
 
@@ -18296,6 +18296,8 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 
  parm = DECL_CHAIN (parm);
}
+ else if (parm)
+   parm = DECL_CHAIN (parm);
 
  if (generic_decl_parm)
generic_decl_parm = DECL_CHAIN (generic_decl_parm);
@@ -19794,6 +19796,7 @@ gen_type_die_with_usage (tree type, dw_die_ref 
context_die,
 case FIXED_POINT_TYPE:
 case COMPLEX_TYPE:
 case BOOLEAN_TYPE:
+case POINTER_BOUNDS_TYPE:
   /* No DIEs needed for fundamental types.  */
   break;
 
@@ -20441,7 +20444,8 @@ dwarf2out_global_decl (tree decl)
  declarations, file-scope (extern) function declarations (which
  had no corresponding body) and file-scope tagged type declarations
  and definitions which have not yet been forced out.  */
-  if (TREE_CODE (decl) != FUNCTION_DECL || !DECL_INITIAL (decl))
+  if ((TREE_CODE (decl) != FUNCTION_DECL || !DECL_INITIAL (decl))
+  && !POINTER_BOUNDS_P (decl))
 dwarf2out_decl (decl);
 }
 


Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx patterns in IVOPT

2013-11-18 Thread Bernd Schmidt
On 11/04/2013 04:31 AM, bin.cheng wrote:
> 2013-11-01  Bin Cheng  
> 
>   * tree-ssa-loop-ivopts.c (enum ainc_type): New.
>   (address_cost_data): New field.
>   (get_address_cost): Compute auto-increment rtx cost in ainc_costs.
>   Use ainc_costs for auto-increment rtx patterns.
>   Cleanup TWS.

I think this is fine. I'd just like to see AINC_NUM gone and its use
replaced by AIC_NONE, we don't really need two separate enum codes for that.


Bernd



RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-11-18 Thread Bernd Edlinger
Hi,


This modified test case exposes a bug in the already approved part of the 
strict-volatile-bitfields patch:

#include 

typedef struct {
  char pad;
  int arr[0];
} __attribute__((packed)) str;

str *
foo (int* src)
{
  str *s = malloc (sizeof (str) + sizeof (int));
  s->arr[0] = 0x12345678;
  asm volatile("":::"memory");
  *src = s->arr[0];
  return s;
}


As we know this test case triggered a recursion in the store_bit_field on ARM 
and on PowerPC,
which is no longer reproducible after this patch is applied: 
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html

Additionally it triggered a recursion on extract_bit_field, but _only_ on my 
local copy of the trunk.
I had this patch installed, but did not expect it to change anything unless the 
values are volatile.
That was cased by this hunk in the strict-volatile-bitfields v4 patch:


@@ -1691,45 +1736,19 @@ extract_fixed_bit_field (enum machine_mo
 includes the entire field.  If such a mode would be larger than
 a word, we won't be doing the extraction the normal way.  */

-  if (MEM_VOLATILE_P (op0)
- && flag_strict_volatile_bitfields> 0)
-   {
- if (GET_MODE_BITSIZE (GET_MODE (op0))> 0)
-   mode = GET_MODE (op0);
- else if (target && GET_MODE_BITSIZE (GET_MODE (target))> 0)
-   mode = GET_MODE (target);
- else
-   mode = tmode;
-   }
-  else
-   mode = get_best_mode (bitsize, bitnum, 0, 0,
- MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
+  mode = GET_MODE (op0);
+  if (GET_MODE_BITSIZE (mode) == 0
+ || GET_MODE_BITSIZE (mode)> GET_MODE_BITSIZE (word_mode))
+   mode = word_mode;
+  mode = get_best_mode (bitsize, bitnum, 0, 0,
+   MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));

   if (mode == VOIDmode)
    /* The only way this should occur is if the field spans word
   boundaries.  */
    return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);

So the problem started, because initially this function did not look at 
GET_MODE(op0)
and always used word_mode. That was changed, but now also affected non-volatile 
data.


Now, if we solve this differently and install the C++ memory model patch,
we can avoid to introduce the recursion in the extract path,
and remove these two hunks in the update patch at the same time:

+  else if (MEM_P (str_rtx)
+  && MEM_VOLATILE_P (str_rtx)
+  && flag_strict_volatile_bitfields> 0)
+    /* This is a case where -fstrict-volatile-bitfields doesn't apply
+   because we can't do a single access in the declared mode of the field.
+   Since the incoming STR_RTX has already been adjusted to that mode,
+   fall back to word mode for subsequent logic.  */
+    str_rtx = adjust_address (str_rtx, word_mode, 0);



Attached you'll find a new version of the bitfields-update patch,
it is again relative to the already approved version of the volatile-bitfields 
patch v4, part 1/2.

Boot-strapped and regression-tested on X86_64-linux-gnu.
additionally tested with an ARM cross-compiler.


OK for trunk?


Thanks
Bernd.2013-11-18  Bernd Edlinger  
Sandra Loosemore  

PR middle-end/23623
PR middle-end/48784
PR middle-end/56341
PR middle-end/56997
* expmed.c (strict_volatile_bitfield_p): Add bitregion_start
and bitregion_end parameters.  Test for compliance with C++
memory model.
(store_bit_field): Adjust call to strict_volatile_bitfield_p.
Add fallback logic for cases where -fstrict-volatile-bitfields
is supposed to apply, but cannot.
(extract_bit_field): Likewise. Use narrow_bit_field_mem and
extract_fixed_bit_field_1 to do the extraction.
(extract_fixed_bit_field): Revert to previous mode selection algorithm.
Call extract_fixed_bit_field_1 to do the real work.
(extract_fixed_bit_field_1): New function.

testsuite:
2013-11-18  Bernd Edlinger  
Sandra Loosemore  

* gcc.dg/pr23623.c: Update to test interaction with C++
memory model.




patch-bitfields-update-1.diff
Description: Binary data


Re: [PATCH] Support -fsanitize=leak

2013-11-18 Thread Marek Polacek
On Fri, Nov 15, 2013 at 08:41:38PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> This patch adds support for -fsanitize=leak and -static-liblsan options.
> If combined with -fsanitize=address, it does nothing, otherwise it links
> in liblsan, a new shared+static library (on x86_64-linux only so far,
> the code isn't 32-bit ready apparently).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2013-11-14  Jakub Jelinek  
> 
>   PR sanitizer/59061
>   * common.opt (static-liblsan): Add.
>   * config/gnu-user.h (STATIC_LIBLSAN_LIBS, STATIC_LIBUBSAN_LIBS):
>   Define.
>   * flag-types.h (enum sanitize_code): Add SANITIZE_LEAK.  Renumber
>   SANITIZE_SHIFT, SANITIZE_DIVIDE, SANITIZE_UNREACHABLE, SANITIZE_VLA.
>   * opts.c (common_handle_option): Handle -fsanitize=leak.
>   * gcc.c (ADD_STATIC_LIBLSAN_LIBS, LIBLSAN_SPEC): Define.
>   (LIBUBSAN_SPEC): Don't test LIBUBSAN_EARLY_SPEC.
>   (LIBUBSAN_EARLY_SPEC): Remove.
>   (SANITIZER_EARLY_SPEC): Don't do anything for libubsan.
>   (SANITIZER_SPEC): Add -fsanitize=leak handling.
>   (sanitize_spec_function): Handle %sanitize(leak).
> 
>   * configure.tgt: Set LSAN_SUPPORTED=yes for x86_64-linux.
>   * configure.ac (LSAN_SUPPORTED): New AM_CONDITIONAL.
>   * lsan/Makefile.am (toolexeclib_LTLIBRARIES, lsan_files,
>   liblsan_la_SOURCES, liblsan_la_LIBADD, liblsan_la_LDFLAGS): Add.
>   * lsan/Makefile.in: Regenerated.

Looks good to me.  Moreover, it fixes the bootstrap-ubsan failure when
building fixincl - I guess the gnu-user.h hunk was missing.  Thanks.

Marek


Re: [0/10] Replace host_integerp and tree_low_cst

2013-11-18 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, Nov 18, 2013 at 10:08 AM, Richard Sandiford
>  wrote:
>> Jeff Law  writes:
>>> On 11/16/13 05:53, Richard Sandiford wrote:
 After the patch that went in yesterday, all calls to host_integerp and
 tree_low_cst pass a constant "pos" argument.  This series replaces each
 function with two separate ones:
>>> [ ... ]
>>> So I've almost entirely ignored the whole wide-int conversion discussion
>>> and I suspect I'm not entirely alone.
>>>
>>> Can you briefly summarize what's y'all are trying to accomplish with the
>>> wide-int changes?
>>
>> At the moment, we can only handle tree and rtl integer constants that
>> fit in 2 HOST_WIDE_INTs.  The idea is to remove that limit.  E.g. things
>> like OImode (used in a few ports) will become a first-class citizen,
>> with all OImode values being representable.
>>
>> Besides that headline reason, there are various side benefits.  E.g.:
>>
>> - All INTEGER_CSTs can be viewed either in their TYPE_PRECISION or in
>>   "infinite" precision, which isn't possible for 128-bit constants today.
>>   (I.e. there's no way to distinguish signed and unsigned 128-bit constants
>>   in a double_int.)
>>
>> - Wider-than-2-HWI intermediate results can be represented as a single
>>   integer.  I'm told this is useful for VRP.  (wide-int is mostly Kenny
>>   and Mike's work, I've just been butting in recently.)
>>
>> - rtl-level constant folding can use the same code to handle all
>>   combinations of CONST_INT and CONST_DOUBLE (and CONST_WIDE_INT,
>>   on converted ports).  At the moment we handle CONST_INT cases
>>   specially, and don't try as hard with CONST_DOUBLEs.
>>
>> Implementation-wise, it tries to make it so that the common single-HWI
>> cases are still fast.
>
> Sadly CONST_WIDE_INTs don't get a mode:
>
> rtx
> immed_wide_int_const (const wide_int &v, enum machine_mode mode)
> {
> ...
> /* It is so tempting to just put the mode in here.  Must control
>myself ... */
> PUT_MODE (value, VOIDmode);
> CWI_PUT_NUM_ELEM (value, len);
>
> so much for an incentive to get more targets converted...
> (only after all targets are converted we possibly can merge CONST_INT
> and CONST_WIDE_INT).

Yeah, but the other requirement for merging CONST_INT and CONST_WIDE_INT
is that they both treat the mode field in the same way.  Which means that
all existing CONST_INT code needs to be converted to store the mode first.
Changing whether CONST_WIDE_INT stores a mode is trivial compared to that.

Adding modes to the rtx integer constants is something I definitely want
to do (and have some local patches towards), but it's going to take a while.
Until then I think it would just be too confusing to have a TImode 0
stored without a mode but a TImode 1 << 100 (say) stored with a mode.

In the meantime, the incentive for converting targets is so that we can
stop using CONST_DOUBLE for integers.  Plus the interfaces are IMO nicer
with wi::...

Thanks,
Richard


[PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch to add size relocation and instruction to obtain object's size 
in i386 target.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* config/i386/i386.md (UNSPEC_SIZEOF): New.
(move_size_reloc_): New.
* config/i386/predicates.md (size_relocation): New.
(x86_64_zext_immediate_operand): Support UNSPEC_SIZEOF.
* config/i386/i386.c (output_pic_addr_const): Support UNSPEC_SIZEOF.
(i386_asm_output_addr_const_extra): Likewise.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d581b96..a427c15 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13809,6 +13809,10 @@ output_pic_addr_const (FILE *file, rtx x, int code)
  machopic_output_function_base_name (file);
  break;
 #endif
+   case UNSPEC_SIZEOF:
+ fputs ("@SIZE", file);
+ break;
+
default:
  output_operand_lossage ("invalid UNSPEC as operand");
  break;
@@ -15387,6 +15391,11 @@ i386_asm_output_addr_const_extra (FILE *file, rtx x)
   }
   break;
 
+case UNSPEC_SIZEOF:
+  output_addr_const (file, op);
+  fputs ("@SIZE", file);
+  break;
+
 default:
   return false;
 }
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e23b3b6..97dd89c 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -79,6 +79,7 @@
   UNSPEC_PLTOFF
   UNSPEC_MACHOPIC_OFFSET
   UNSPEC_PCREL
+  UNSPEC_SIZEOF
 
   ;; Prologue support
   UNSPEC_STACK_ALLOC
@@ -18446,6 +18447,14 @@
   "bndstx\t{%2, %3|%3, %2}"
   [(set_attr "type" "mpxst")])
 
+(define_insn "move_size_reloc_"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+(match_operand: 1 "size_relocation" "Z"))]
+  ""
+{
+  return "mov{}\t{%1, %0|%0, %1}";
+})
+
 (include "mmx.md")
 (include "sse.md")
 (include "sync.md")
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 66ac52f..23231b1 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -119,6 +119,15 @@
(match_test "TARGET_64BIT")
(match_test "REGNO (op) > BX_REG")))
 
+;; Return true if VALUE is size relocation
+(define_predicate "size_relocation"
+  (match_code "const")
+{
+  return (GET_CODE (op) == CONST
+  && GET_CODE (XEXP (op, 0)) == UNSPEC
+ && XINT (XEXP (op, 0), 1) == UNSPEC_SIZEOF);
+})
+
 ;; Return true if VALUE can be stored in a sign extended immediate field.
 (define_predicate "x86_64_immediate_operand"
   (match_code "const_int,symbol_ref,label_ref,const")
@@ -323,6 +332,13 @@
  return false;
}
}
+  else if (GET_CODE (XEXP (op, 0)) == UNSPEC)
+{
+  if (XINT (XEXP (op, 0), 1) == UNSPEC_SIZEOF
+ && XVECLEN (XEXP (op, 0), 0) == 1
+ && GET_CODE (XVECEXP (XEXP (op, 0), 0, 0)) == SYMBOL_REF)
+   return true;
+}
   break;
 
 default:


Re: Clean up LTO generation

2013-11-18 Thread Jan Hubicka
> On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt  
> wrote:
> > I'm looking into using LTO to pass information between compilers for
> > different targets, for the OpenACC implementation. This area of the
> > compiler seems somewhat obfuscated by overengineering, and I'd like to
> > simplify it first to make it easier to work with.
> >
> > lto_gimple_out and lto_finish_out aren't real passes, only their
> > write_summary methods are nonnull. The only thing we really do to emit
> > LTO is just to make two function calls, to lto_output and
> > produce_asm_for_decls, but this is wrapped in a lot of pass_manager
> > boilerplate to confuse the reader. Other downsides are bogus empty dump
> > files left behind, and plain dead code like
> >
> >   ipa_read_optimization_summaries_1 (passes->all_lto_gen_passes);
> >
> > The following patch simplifies this by getting rid of
> > all_lto_gen_passes. We could simplify a little further if we decided we
> > really don't need two different timevars for different steps of LTO
> > output.  Bootstrapped and tested on x86_64-linux, ok?
> 
> I'm fine with this - Honza, any objections?

Not at all, I alos considered this somewhat pointless.

Honza
> 
> Thanks,
> Richard.
> 
> >
> > Bernd


[PATCH] Fix libbacktrace on prelinked shared libraries

2013-11-18 Thread Jakub Jelinek
Hi!

info->dlpi_addr == 0 is true for executables (but those have also
info->dlpi_name set to ""), but not just for those, it is true also
for successfully prelinked shared libraries or any other libraries that
have been linked at non-zero offset and the dynamic linker managed to mmap
them where they were prelinked resp. linked to.

BTW, I'm surprised you aren't adding base_address (well, in fact it is
actually not base address, but load bias, difference between load address
and link base address) to elf_initialize_syminfo created symbol addresses,
because all of them should be biased by the load address.
And, elf_initialize_syminfo should ignore st_shndx == SHN_UNDEF symbols,
those shouldn't be biased but not even entered into the table.

2013-11-18  Jakub Jelinek  

* elf.c (phdr_callback): Process info->dlpi_addr == 0 normally.

--- libbacktrace/elf.c.jj   2013-11-18 09:59:08.0 +0100
+++ libbacktrace/elf.c  2013-11-18 12:48:07.780700382 +0100
@@ -863,12 +863,8 @@ phdr_callback (struct dl_phdr_info *info
   fileline elf_fileline_fn;
   int found_dwarf;
 
-  /* There is not much we can do if we don't have the module name.  If
- the base address is 0, this is probably the executable, which we
- already loaded.  */
-  if (info->dlpi_name == NULL
-  || info->dlpi_name[0] == '\0'
-  || info->dlpi_addr == 0)
+  /* There is not much we can do if we don't have the module name.  */
+  if (info->dlpi_name == NULL || info->dlpi_name[0] == '\0')
 return 0;
 
   descriptor = backtrace_open (info->dlpi_name, pd->error_callback, pd->data,

Jakub


Re: Clean up LTO generation

2013-11-18 Thread Bernd Schmidt
On 11/18/2013 02:03 PM, Jan Hubicka wrote:
>> On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt  
>> wrote:
>>> The following patch simplifies this by getting rid of
>>> all_lto_gen_passes. We could simplify a little further if we decided we
>>> really don't need two different timevars for different steps of LTO
>>> output.  Bootstrapped and tested on x86_64-linux, ok?
>>
>> I'm fine with this - Honza, any objections?
> 
> Not at all, I alos considered this somewhat pointless.

Ok, cool. Any opinions on the two timevars, should I reduce that before
committing?


Bernd



Re: [PATCH] Fix PRs59125 and 54570

2013-11-18 Thread Richard Biener
On Fri, 15 Nov 2013, Richard Biener wrote:

> On Fri, 15 Nov 2013, Jakub Jelinek wrote:
> 
> > On Fri, Nov 15, 2013 at 02:56:51PM +0100, Richard Biener wrote:
> > > Now that there is (finally :() a wrong-code testcase for the
> > > PR54570 issue we can no longer ignore it (bah).  So the following
> > > tries to paper over the fact that object-size sucks and disables
> > > value-numbering of equal addresses the same before that pass
> > > had a chance to finally look at the structure of the addresses.
> > > 
> > > To make this "fix" suck less I moved the object-size pass before
> > > the final FRE pass runs which is after IPA inlining and the
> > > propagation of constants and addresses.  You won't catch
> > > any "improvements" you'd get by memory CSE opportunities that
> > > IPA inlining exposes, but you cannot have everything here.
> > 
> > If it doesn't regress anything in the testsuite, I guess that is ok.
> > 
> > > (IMHO object-size would should run during early optimizations)
> > 
> > It can't, because inlining and some limited cleanup afterwards is essential
> > for it.  Otherwise you'd regress not just for __builtin_object_size (x, 1),
> > which admittedly is problematic since the introduction of MEM_REFs and
> > various other changes, but also for __builtin_object_size (x, 0), which
> > would be much more important.
> > 
> > As discussed earlier, perhaps instead of checking cfun->after_inlining
> > you could just introduce a new flag whether cfun contains any
> > __builtin_object_size (x, {1,3}) calls, initialized by the gimplifier,
> > propagated by the inliner and finally cleared again by objsz pass.
> 
> But you'd need to pessimistically initialize it because if you inline
> into a function with __builtin_object_size you may not previously
> optimize.  You can of course analyze the cgraph to clear it for
> functions that cannot end up being inlined into such function.
> So that's effectively the same as ->after_inlining
> minus losing the optimization that didn't end up with 
> __builtin_object_size after that.
> 
> Not sure if it's worth all the trouble.  Arriving at a better
> design for computing __builtin_object_size would be better ;)
> (not that I have one)
> 
> > Of course, if moving objsz earlier seems to work, it could stay where you
> > put it, but the flag could make it clearer why you want to avoid certain
> > optimizations.
> 
> Well, all object-size testcases are pretty simplistic right now and
> don't trigger the IPA inliner for example.
> 
> > > Bootstrap / regtest pending on x86_64-unknown-linux-gnu.
> > > 
> > > Similar candidate for the 4.8 branch.
> > 
> > Please wait sufficiently long for trunk issues before you backport.
> 
> Of course.

So I had to do some more changes because doing objsz earlier (and
separating it from strlenopt) exposes that the GENERIC folders (ugh)
for strcat mess up points-to info as it folds strcat to
memcpy (SAVE_EXPR , ...) which adds a new
SSA name for the destination with no alias info associated.

Rather than moving the non-constant builtin foldings to GIMPLE
I chose to make sure to forward the constants in objsz and fold
the stmts there (as PTA is re-run shortly after it).

This then breaks some of the scan-tree-dumps in the strlenopt
testcases because nobody has previously folded the _chk builtins
with -1U size to non-_chk variants, so I have to adjust them
(I didn't want to move the strlenopt pass as well).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Any objections?

Thanks,
Richard.

2013-11-18  Richard Biener  

PR tree-optimization/59125
PR tree-optimization/54570
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): When inlining
is not complete do not treat component-references with offset zero
but different fields as equal.
* tree-object-size.c: Include tree-phinodes.h and ssa-iterators.h.
(compute_object_sizes): Apply TLC.  Propagate the constant
results into all uses and fold their stmts.
* passes.def (pass_all_optimizations): Move pass_object_sizes
after the first pass_forwprop and before pass_fre.

* gcc.dg/builtin-object-size-8.c: Un-xfail.
* gcc.dg/builtin-object-size-14.c: New testcase.
* gcc.dg/strlenopt-14gf.c: Adjust.
* gcc.dg/strlenopt-1f.c: Likewise.
* gcc.dg/strlenopt-4gf.c: Likewise.

Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c.orig   2013-11-18 11:20:36.0 +0100
--- gcc/tree-ssa-sccvn.c2013-11-18 11:29:12.300337512 +0100
*** copy_reference_ops_from_ref (tree ref, v
*** 760,766 
  }
  
/* For non-calls, store the information that makes up the address.  */
! 
while (ref)
  {
vn_reference_op_s temp;
--- 760,766 
  }
  
/* For non-calls, store the information that makes up the address.  */
!   tree orig = ref;
while (ref)
  {
vn_ref

Re: Clean up LTO generation

2013-11-18 Thread Richard Biener
On Mon, Nov 18, 2013 at 2:19 PM, Bernd Schmidt  wrote:
> On 11/18/2013 02:03 PM, Jan Hubicka wrote:
>>> On Mon, Nov 18, 2013 at 11:10 AM, Bernd Schmidt  
>>> wrote:
 The following patch simplifies this by getting rid of
 all_lto_gen_passes. We could simplify a little further if we decided we
 really don't need two different timevars for different steps of LTO
 output.  Bootstrapped and tested on x86_64-linux, ok?
>>>
>>> I'm fine with this - Honza, any objections?
>>
>> Not at all, I alos considered this somewhat pointless.
>
> Ok, cool. Any opinions on the two timevars, should I reduce that before
> committing?

No, it's useful to have both separated.

Thanks,
Richard.

>
> Bernd
>


[PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-18 Thread Jakub Jelinek
Hi!

This patch adds libbacktrace symbolizer to libsanitizer, with which we
can avoid spawning and running external an external program (especially when
it is not even provided by gcc itself).

I've kept the possibility to override the symbolizer by magic symbols
(InternalSymbolizer), and as I have no idea how the llvm buildsystem etc.
works and what is the possibility there to add libbacktrace, this just
requires users to compile with SANITIZE_LIBBACKTRACE defined to signal
that backtrace-supported.h and backtrace.h is available and the
sanitizer_symbolizer_posix_libcdep.cc source then decides based on
backtrace-supported.h etc. whether it is usable.

make check RUNTESTFLAGS='asan.exp ubsan.exp' passes with this.

Some pending issues on the libbacktrace side:
1) right now libbacktrace performs dl_iterate_phdr only the first time
   backtrace_pcinfo or backtrace_syminfo is called, if there are some
   dlopens/dlcloses in between that and another querying of the symbolizer,
   it won't notice that.  Perhaps it can be done only when we don't
   find a symbol and/or have some function that tries to dl_iterare_phdr
   again, look at cached st_ino/st_mtime or similar, and for threaded
   version likely just keep old records, just add a flag to them that they
   should be ignored (or say atomically decrease symbol count to zero
   and something similar for debug info).
2) for tsan querying of data symbols, apparently the classes want to see
   not just the symbol name and start value, but also size.  libbacktrace
   has all this info available, just doesn't pass it down to the callback.
   I wonder if we'd need to create yet another libbacktrace entrypoint, or
   if it would be acceptable to do source code incompatible, ABI (at least
   on all sane targets) compatible version of just adding another
   uintptr_t symsize argument to backtrace_syminfo_callback.
3) I wonder if libbacktrace couldn't be updated to use __atomic_* builtins,
   then it could avoid the ugliness to emulate atomic loads and stores.

As for sanitizer, the reason I haven't implemented SendCommand method for
the libbacktrace symbolizer is that the library doesn't provide the info
as text, but as individual values passed to the callback, so printing
that to text and then parsing the text would be very ugly.  libbacktrace
doesn't even need the module names and module offsets, so supposedly
we would need that only if libbacktrace failed to get accurate inline/call
or symbol info.  While the classes have both symbol name/file/line etc.
and module name/offset fields, apparently the latter are unused if the
former is filled in.

2013-11-18  Jakub Jelinek  

PR sanitizer/59136
* configure.ac: Don't add target-libbacktrace to noconfigdirs
just because go hasn't been enabled if target-libsanitizer isn't in
noconfigdirs.
* Makefile.def: Add configure-target-libsanitizer dependency on
configure-target-libbacktrace and all-target-libsanitizer dependency
on configure-target-libsanitizer.
* configure: Regenerated.
libsanitizer/
* sanitizer_common/Makefile.am (AM_CPPFLAGS): If
SANITIZER_LIBBACKTRACE, append -I for libbacktrace headers and
-DSANITIZER_LIBBACKTRACE.
* sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc: Add
libbacktrace symbolizer.
* tsan/Makefile.am (libtsan_la_LIBADD): Add libbacktrace.la if
SANITIZER_LIBBACKTRACE.
* asan/Makefile.am (libasan_la_LIBADD): Likewise.
* ubsan/Makefile.am (libubsan_la_LIBADD): Likewise.
* configure.ac (SANITIZER_LIBBACKTRACE): New AM_CONDITIONAL.
* sanitizer_common/Makefile.in: Regenerated.
* tsan/Makefile.in: Regenrated.
* asan/Makefile.in: Regenerated.
* ubsan/Makefile.in: Regenerated.
* configure: Regenerated.
gcc/testsuite/
* c-c++-common/asan/strip-path-prefix-1.c: Allow also the
filename:line instead of (modulename+offset) form with stripped
initial / from the filename.

--- libsanitizer/sanitizer_common/Makefile.am.jj2013-11-18 
09:59:04.012244260 +0100
+++ libsanitizer/sanitizer_common/Makefile.am   2013-11-18 11:04:54.241360825 
+0100
@@ -1,5 +1,8 @@
-AM_CPPFLAGS = -I $(top_srcdir)/include 
- 
+AM_CPPFLAGS = -I $(top_srcdir)/include
+if SANITIZER_LIBBACKTRACE
+AM_CPPFLAGS += -I $(top_srcdir)/../libbacktrace -I 
$(top_builddir)/../libbacktrace -DSANITIZER_LIBBACKTRACE
+endif
+
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
--- libsanitizer/sanitizer_common/Makefile.in.jj2013-11-18 
09:59:04.012244260 +0100
+++ libsanitizer/sanitizer_common/Makefile.in   2013-11-18 12:55:52.709310820 
+0100
@@ -35,6 +35,7 @@ POST_UNINSTALL = :
 build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
+@SANITIZER_LIBBACKTRACE_TRUE@am__append_1 = -I $(top_srcdir)/../libbacktrace 
-I $(top_builddir)/../libbacktrace -DSANITIZER_LIBBACKTRACE

Re: [PATCH] Implement -fsanitize=null + new sanopt pass

2013-11-18 Thread Jakub Jelinek
On Wed, Nov 13, 2013 at 12:13:48AM +0100, Marek Polacek wrote:
> --- gcc/config/bootstrap-ubsan.mk.mp  2013-11-12 13:46:13.345182065 +0100
> +++ gcc/config/bootstrap-ubsan.mk 2013-11-12 13:46:49.812314016 +0100
> @@ -2,6 +2,6 @@
>  
>  STAGE2_CFLAGS += -fsanitize=undefined
>  STAGE3_CFLAGS += -fsanitize=undefined
> -POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread \
> +POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread -ldl \

Hopefully with my pending patch you can remove the -lpthread -ldl again, but
ok for now.
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> + {
> +   gimple stmt = gsi_stmt (gsi);
> +
> +   if (gimple_code (stmt) != GIMPLE_CALL)

if (is_gimple_call (stmt))

Ok with those changes.

Jakub


Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread H.J. Lu
On Mon, Nov 18, 2013 at 1:02 AM, Arnaud Charlet  wrote:
>> Actually there are two issues with your change:
>>
>> Using time_t for tv_nsec looks actually wrong, the definition on (my)
>> linux is:
>>
>> struct timespec {
>> __kernel_time_t tv_sec; /* seconds */
>> longtv_nsec;/* nanoseconds */
>> };

This is wrong for x32 where tv_nsec is long long, not
long.  There are a couple places where long should be
long long for x32. Glibc gets them right properly:

/* A time value that is accurate to the nearest
   microsecond but also has a range of years.  */
struct timeval
  {
__time_t tv_sec;/* Seconds.  */
__suseconds_t tv_usec;  /* Microseconds.  */
  };


>> As you can see, the types for tv_sec and tv_nsec are different.
>>
>> So the change to timespec looks wrong to me.
>
> In addition concerning s-taprop-linux.adb, the definition of struct
> timeval on linux is:
>
> /* A time value that is accurate to the nearest
>microsecond but also has a range of years.  */
> struct timeval
>   {
> __time_t tv_sec;/* Seconds.  */
> __suseconds_t tv_usec;  /* Microseconds.  */
>   };
>
> So again, we have another type (__suseconds_t), and NOT time_t or __time_t
> so it's really wrong for tv_nsec to use time_t.
>

All for Linux architectures, __suseconds_t and
__suseconds_t are the same as time_t, including
x32.  If not, we can't use array for timeval.

-- 
H.J.


[v3 patch] __shared_ptr should never use atomic ops

2013-11-18 Thread Jonathan Wakely
This is something I've been meaning to change for years, see
http://gcc.gnu.org/ml/libstdc++/2007-10/msg00180.html for a prior
post. Currently __shared_ptr specializations still
dispatch through __gnu_cxx::__exchange_and_add_dispatch for ref-count
updates.  The attached patch means the _S_single policy is always
non-atomic, so that you can get a faster (but not threadsafe!) smart
pointer with:

template
  using shared_ptr_unsynchronized
 = std::__shared_ptr;

This is something I've needed and I've seen others ask about it too e.g.
http://stackoverflow.com/q/15129263/981959
http://stackoverflow.com/q/9792668/981959

2013-11-18  Jonathan Wakely  

* include/bits/shared_ptr_base.h (_Sp_counted_base<_S_single>): Use
non-atomic operations.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust line number.
* testsuite/20_util/shared_ptr/cons/void_neg.cc: Likewise.

Tested x86_64-linux, with --enable-threads and --disable-threads,
committed to trunk.
commit 1f4fcb52da0e14389ffd53bca6dc060a30d180c2
Author: Jonathan Wakely 
Date:   Mon Nov 18 11:50:19 2013 +

* include/bits/shared_ptr_base.h (_Sp_counted_base<_S_single>): Use
non-atomic operations.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust line number.
* testsuite/20_util/shared_ptr/cons/void_neg.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index cf90d7a..68ccc9e 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -209,11 +209,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _Sp_counted_base<_S_single>::
 _M_add_ref_lock()
 {
-  if (__gnu_cxx::__exchange_and_add_dispatch(&_M_use_count, 1) == 0)
-   {
- _M_use_count = 0;
- __throw_bad_weak_ptr();
-   }
+  if (_M_use_count == 0)
+   __throw_bad_weak_ptr();
+  ++_M_use_count;
 }
 
   template<>
@@ -248,6 +246,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __ATOMIC_RELAXED));
 }
 
+  template<>
+inline void
+_Sp_counted_base<_S_single>::_M_add_ref_copy()
+{ ++_M_use_count; }
+
+  template<>
+inline void
+_Sp_counted_base<_S_single>::_M_release() noexcept
+{
+  if (--_M_use_count == 0)
+{
+  _M_dispose();
+  if (--_M_weak_count == 0)
+_M_destroy();
+}
+}
+
+  template<>
+inline void
+_Sp_counted_base<_S_single>::_M_weak_add_ref() noexcept
+{ ++_M_weak_count; }
+
+  template<>
+inline void
+_Sp_counted_base<_S_single>::_M_weak_release() noexcept
+{
+  if (--_M_weak_count == 0)
+_M_destroy();
+}
+
+  template<>
+inline long
+_Sp_counted_base<_S_single>::_M_get_use_count() const noexcept
+{ return _M_use_count; }
+
 
   // Forward declarations.
   template
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc 
b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
index db3fcac..01acca5 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
@@ -32,7 +32,7 @@ void test01()
 {
   X* px = 0;
   std::shared_ptr p1(px);   // { dg-error "here" }
-  // { dg-error "incomplete" "" { target *-*-* } 779 }
+  // { dg-error "incomplete" "" { target *-*-* } 812 }
 
   std::shared_ptr p9(ap());  // { dg-error "here" }
   // { dg-error "incomplete" "" { target *-*-* } 307 }
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc 
b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc
index 3fd38cf..5389159 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc
@@ -25,5 +25,5 @@
 void test01()
 {
   std::shared_ptr p((void*)nullptr);   // { dg-error "here" }
-  // { dg-error "incomplete" "" { target *-*-* } 778 }
+  // { dg-error "incomplete" "" { target *-*-* } 811 }
 }


Re: [PATCH] Implement -fsanitize=null + new sanopt pass

2013-11-18 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 02:51:41PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 13, 2013 at 12:13:48AM +0100, Marek Polacek wrote:
> > --- gcc/config/bootstrap-ubsan.mk.mp2013-11-12 13:46:13.345182065 
> > +0100
> > +++ gcc/config/bootstrap-ubsan.mk   2013-11-12 13:46:49.812314016 +0100
> > @@ -2,6 +2,6 @@
> >  
> >  STAGE2_CFLAGS += -fsanitize=undefined
> >  STAGE3_CFLAGS += -fsanitize=undefined
> > -POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread \
> > +POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread -ldl 
> > \
> 
> Hopefully with my pending patch you can remove the -lpthread -ldl again, but
> ok for now.
> > +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > +   {
> > + gimple stmt = gsi_stmt (gsi);
> > +
> > + if (gimple_code (stmt) != GIMPLE_CALL)
> 
> if (is_gimple_call (stmt))
> 
> Ok with those changes.

Oh, one more thing, please update gcc/doc/, the -fsanitize= description is
far from up to date there.

Jakub


Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread Arnaud Charlet
> >> struct timespec {
> >> __kernel_time_t tv_sec; /* seconds */
> >> longtv_nsec;/* nanoseconds */
> >> };
> 
> This is wrong for x32 where tv_nsec is long long, not
> long.  There are a couple places where long should be
> long long for x32.

Well yes, but the type is *not* time_t, it's conceptually a different type.

> Glibc gets them right properly:

But that's for timeval, not timespec.

> All for Linux architectures, __suseconds_t and
> __suseconds_t are the same as time_t, including
> x32.  If not, we can't use array for timeval.

Well this PR is about x32 being different from all other linux ports. There's
no guarantee that we won't have exactly the same issue on future ports where
__suseconds_t is *not* the same as time_t.

Also there's the issue of s-osinte-solaris-posix.ads which is broken with
your change right now.

The short term solution would be to apply this change:

--- s-osinte-solaris-posix.ads  (revision 298928)
+++ s-osinte-solaris-posix.ads  (working copy)
@@ -513,7 +513,7 @@

type timespec is record
   tv_sec  : time_t;
-  tv_nsec : long;
+  tv_nsec : time_t;
end record;
pragma Convention (C, timespec);

But I'd rather settle on which type is appropriate for the tv_nsec
field before making this change.

Arno


Re: [PATCH] Support -fsanitize=leak

2013-11-18 Thread Jakub Jelinek
On Fri, Nov 15, 2013 at 12:34:14PM -0800, Ian Lance Taylor wrote:
> On Fri, Nov 15, 2013 at 11:41 AM, Jakub Jelinek  wrote:
> >
> > This patch adds support for -fsanitize=leak and -static-liblsan options.
> > If combined with -fsanitize=address, it does nothing, otherwise it links
> > in liblsan, a new shared+static library (on x86_64-linux only so far,
> > the code isn't 32-bit ready apparently).
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Documentation?

Here it is, as incremental patch:

2013-11-18  Jakub Jelinek  

* doc/invoke.texi (-static-liblsan, -fsanitize=leak): Document.

--- gcc/doc/invoke.texi.jj  2013-11-18 09:59:09.0 +0100
+++ gcc/doc/invoke.texi 2013-11-18 14:57:12.240073775 +0100
@@ -454,7 +454,7 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{@var{object-file-name}  -l@var{library} @gol
 -nostartfiles  -nodefaultlibs  -nostdlib -pie -rdynamic @gol
 -s  -static -static-libgcc -static-libstdc++ @gol
--static-libasan -static-libtsan -static-libubsan @gol
+-static-libasan -static-libtsan -static-liblsan -static-libubsan @gol
 -shared -shared-libgcc  -symbolic @gol
 -T @var{script}  -Wl,@var{option}  -Xlinker @var{option} @gol
 -u @var{symbol}}
@@ -5259,6 +5259,13 @@ Memory access instructions will be instr
 data race bugs.
 See @uref{http://code.google.com/p/data-race-test/wiki/ThreadSanitizer} for 
more details.
 
+@item -fsanitize=leak
+Enable LeakSanitizer, a memory leak detector.
+This option only matters for linking of executables and if neither
+@option{-fsanitize=address} nor @option{-fsanitize=thread} is used.  In that
+case it will link the executable against a library that overrides @code{malloc}
+and other allocator functions.
+
 @item -fsanitize=undefined
 Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector
 Various computations will be instrumented to detect undefined behavior
@@ -10181,6 +10188,15 @@ option is not used, then this links agai
 driver to link @file{libtsan} statically, without necessarily linking
 other libraries statically.
 
+@item -static-liblsan
+When the @option{-fsanitize=leak} option is used to link a program,
+the GCC driver automatically links against @option{liblsan}.  If
+@file{liblsan} is available as a shared library, and the @option{-static}
+option is not used, then this links against the shared version of
+@file{liblsan}.  The @option{-static-liblsan} option directs the GCC
+driver to link @file{liblsan} statically, without necessarily linking
+other libraries statically.
+
 @item -static-libubsan
 When the @option{-fsanitize=undefined} option is used to link a program,
 the GCC driver automatically links against @option{libubsan}.  If


Jakub


Re: [PATCH] Support -fsanitize=leak

2013-11-18 Thread Marek Polacek
On Mon, Nov 18, 2013 at 01:15:02PM +0100, Marek Polacek wrote:
> Moreover, it fixes the bootstrap-ubsan failure when
> building fixincl - I guess the gnu-user.h hunk was missing.  Thanks.

Sorry, turned out this is bogus.  It really will need some
configure/Makefile fix.

Marek


[PATCH, i386, MPX, 2/X] Pointers Checker [22/25] Target builtins

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch introducing i386 target versions of Pointer Bounds Checker 
builtins.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* config/i386/i386-builtin-types.def (BND): New.
(ULONG): New.
(BND_FTYPE_PCVOID_ULONG): New.
(VOID_FTYPE_BND_PCVOID): New.
(VOID_FTYPE_PCVOID_PCVOID_BND): New.
(BND_FTYPE_PCVOID_PCVOID): New.
(BND_FTYPE_PCVOID): New.
(BND_FTYPE_BND_BND): New.
(PVOID_FTYPE_PVOID_PVOID_ULONG): New.
(PVOID_FTYPE_PCVOID_BND_ULONG): New.
(ULONG_FTYPE_VOID): New.
(PVOID_FTYPE_BND): New.
* config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h.
(ix86_builtins): Add
IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
IX86_BUILTIN_BNDSET, IX86_BUILTIN_BNDNARROW,
IX86_BUILTIN_BNDINT, IX86_BUILTIN_ARG_BND,
IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
IX86_BUILTIN_BNDUPPER.
(builtin_isa): Add leaf_p and nothrow_p fields.
(def_builtin): Initialize leaf_p and nothrow_p.
(ix86_add_new_builtins): Handle leaf_p and nothrow_p
flags.
(bdesc_mpx): New.
(bdesc_mpx_const): New.
(ix86_init_mpx_builtins): New.
(ix86_init_builtins): Call ix86_init_mpx_builtins.
(ix86_expand_builtin): expand IX86_BUILTIN_BNDMK,
IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDSET,
IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
IX86_BUILTIN_ARG_BND, IX86_BUILTIN_SIZEOF,
IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.



diff --git a/gcc/config/i386/i386-builtin-types.def 
b/gcc/config/i386/i386-builtin-types.def
index c866170..f82ac9b 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -47,6 +47,7 @@ DEF_PRIMITIVE_TYPE (UCHAR, unsigned_char_type_node)
 DEF_PRIMITIVE_TYPE (QI, char_type_node)
 DEF_PRIMITIVE_TYPE (HI, intHI_type_node)
 DEF_PRIMITIVE_TYPE (SI, intSI_type_node)
+DEF_PRIMITIVE_TYPE (BND, pointer_bounds_type_node)
 # ??? Logically this should be intDI_type_node, but that maps to "long"
 # with 64-bit, and that's not how the emmintrin.h is written.  Again, 
 # changing this would change name mangling.
@@ -60,6 +61,7 @@ DEF_PRIMITIVE_TYPE (USHORT, short_unsigned_type_node)
 DEF_PRIMITIVE_TYPE (INT, integer_type_node)
 DEF_PRIMITIVE_TYPE (UINT, unsigned_type_node)
 DEF_PRIMITIVE_TYPE (UNSIGNED, unsigned_type_node)
+DEF_PRIMITIVE_TYPE (ULONG, long_unsigned_type_node)
 DEF_PRIMITIVE_TYPE (LONGLONG, long_long_integer_type_node)
 DEF_PRIMITIVE_TYPE (ULONGLONG, long_long_unsigned_type_node)
 DEF_PRIMITIVE_TYPE (UINT8, unsigned_char_type_node)
@@ -239,6 +241,7 @@ DEF_FUNCTION_TYPE (V4DI, V8HI)
 DEF_FUNCTION_TYPE (V4DI, V4SI)
 DEF_FUNCTION_TYPE (V4DI, PV4DI)
 DEF_FUNCTION_TYPE (V4DI, V2DI)
+DEF_FUNCTION_TYPE (BND, PCVOID, ULONG)
 
 DEF_FUNCTION_TYPE (DI, V2DI, INT)
 DEF_FUNCTION_TYPE (DOUBLE, V2DF, INT)
@@ -374,6 +377,7 @@ DEF_FUNCTION_TYPE (VOID, PV4DI, V4DI)
 DEF_FUNCTION_TYPE (VOID, PV4SF, V4SF)
 DEF_FUNCTION_TYPE (VOID, PV8SF, V8SF)
 DEF_FUNCTION_TYPE (VOID, UNSIGNED, UNSIGNED)
+DEF_FUNCTION_TYPE (VOID, BND, PCVOID)
 
 DEF_FUNCTION_TYPE (INT, V16QI, V16QI, INT)
 DEF_FUNCTION_TYPE (UCHAR, UINT, UINT, UINT)
@@ -439,6 +443,14 @@ DEF_FUNCTION_TYPE (V8UHI, V8UHI, V8UHI, V8UHI)
 DEF_FUNCTION_TYPE (V16UQI, V16UQI, V16UQI, V16UQI)
 DEF_FUNCTION_TYPE (V4DF, V4DF, V4DF, V4DI)
 DEF_FUNCTION_TYPE (V8SF, V8SF, V8SF, V8SI)
+DEF_FUNCTION_TYPE (VOID, PCVOID, PCVOID, BND)
+DEF_FUNCTION_TYPE (BND, PCVOID, PCVOID)
+DEF_FUNCTION_TYPE (BND, PCVOID)
+DEF_FUNCTION_TYPE (BND, BND, BND)
+DEF_FUNCTION_TYPE (PVOID, PVOID, PVOID, ULONG)
+DEF_FUNCTION_TYPE (PVOID, PCVOID, BND, ULONG)
+DEF_FUNCTION_TYPE (ULONG, VOID)
+DEF_FUNCTION_TYPE (PVOID, BND)
 
 DEF_FUNCTION_TYPE (V2DI, V2DI, V2DI, UINT, UINT)
 DEF_FUNCTION_TYPE (V4HI, HI, HI, HI, HI)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a427c15..6ddd37a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -64,6 +64,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "context.h"
 #include "pass_manager.h"
+#include "tree-chkp.h"
+#include "rtl-chkp.h"
 
 static rtx legitimize_dllimport_symbol (rtx, bool);
 static rtx legitimize_pe_coff_extern_decl (rtx, bool);
@@ -27742,6 +27744,21 @@ enum ix86_builtins
   IX86_BUILTIN_XABORT,
   IX86_BUILTIN_XTEST,
 
+  /* MPX */
+  IX86_BUILTIN_BNDMK,
+  IX86_BUILTIN_BNDSTX,
+  IX86_BUILTIN_BNDLDX,
+  IX86_BUILTIN_BNDCL,
+  IX86_BUILTIN_BNDCU,
+  IX86_BUILTIN_BNDRET,
+  IX86_BUILTIN_BNDSET,
+  IX86_BUILTIN_BNDNARROW,
+  IX86_BUILTIN_BNDINT,
+  IX86_BUILTIN_ARG_BND,
+  IX86_BUILTIN_SIZEOF,
+  IX86_BUILTIN_BNDLOWER,
+  IX86_BUILTIN_BNDUPPER,
+
   /* BMI instructions.  */
   IX86_BUILTIN_BEXTR32,
   IX86_BUILTIN_BEXTR64,
@@ -27811,6 +27828,8 

Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread H.J. Lu
On Mon, Nov 18, 2013 at 5:57 AM, Arnaud Charlet  wrote:
>> >> struct timespec {
>> >> __kernel_time_t tv_sec; /* seconds */
>> >> longtv_nsec;/* nanoseconds */
>> >> };
>>
>> This is wrong for x32 where tv_nsec is long long, not
>> long.  There are a couple places where long should be
>> long long for x32.
>
> Well yes, but the type is *not* time_t, it's conceptually a different type.
>
>> Glibc gets them right properly:
>
> But that's for timeval, not timespec.

Glibc has

struct timespec
  {
__time_t tv_sec;/* Seconds.  */
__syscall_slong_t tv_nsec;  /* Nanoseconds.  */
  };

__syscall_slong_t is long long for x32.

>> All for Linux architectures, __suseconds_t and
>> __suseconds_t are the same as time_t, including
>> x32.  If not, we can't use array for timeval.
>
> Well this PR is about x32 being different from all other linux ports. There's
> no guarantee that we won't have exactly the same issue on future ports where
> __suseconds_t is *not* the same as time_t.

Ada was using long for time_t and

type timeval is array (1 .. 2) of C.long

It assumes that the type of tv_nsec is the same as tv_sec.

> Also there's the issue of s-osinte-solaris-posix.ads which is broken with
> your change right now.
>
> The short term solution would be to apply this change:
>
> --- s-osinte-solaris-posix.ads  (revision 298928)
> +++ s-osinte-solaris-posix.ads  (working copy)
> @@ -513,7 +513,7 @@
>
> type timespec is record
>tv_sec  : time_t;
> -  tv_nsec : long;
> +  tv_nsec : time_t;
> end record;
> pragma Convention (C, timespec);
>
> But I'd rather settle on which type is appropriate for the tv_nsec
> field before making this change.
>

__syscall_slong_t is a Linux specific type.  We
can add tv_nsec_t, which should be the same
as time_t for all the current targets.

-- 
H.J.


Use anti-ranges in memcpy/memset expansion

2013-11-18 Thread Jan Hubicka
Hi,
this patch adds support to anti-ranges to determine_block_size.
This makes it possible to bound size of block from bellow that is
useful to avoid need for small size code path and it also allows
us to inline the following:
void *a;
void *b;
t(int c)
{
  if (c<10)
memcpy (a,b,c);
}

Because C is signed, we do not really know that the block is smaller
than 10, but it is most likely the case.

Bootstrapped/regtested x86_64-linux OK?

Honza

* md.texi (setmem): Document new parameter.
* optabs.c (maybe_gen_insn): Support 9 operands.
* builtins.c (determine_block_size): Add probable_max_size;
support anti-ranges.
(expand_builtin_memcpy. expand_builtin_memset_args): Pass around
probable_max_size.
* expr.c (emit_block_move_via_movmem, emit_block_move_hints,
emit_block_move, clear_storage_hints, set_storage_via_setmem):
Likewise.
* expr.h (emit_block_move_hints, clear_storage_hints,
set_storage_via_setmem): Update prototype.
* i386.md (setmem, movmem patterns): Add 9th operand.
* i386-protos.h (ix86_expand_set_or_movmem): Update prototype.
* i386.c (ix86_expand_set_or_movmem): Take probable_max_size_exp
argument; pass it to decide_alg.

* gcc.target/i386/memcpy-3.c: New testcase.
Index: doc/md.texi
===
*** doc/md.texi (revision 204945)
--- doc/md.texi (working copy)
*** all cases. This expected alignment is al
*** 5352,5357 
--- 5352,5359 
  Expected size, when unknown, is set to @code{(const_int -1)}.
  Operand 7 is the minimal size of the block and operand 8 is the
  maximal size of the block (NULL if it can not be represented as CONST_INT).
+ Operand 9 is the probable maximal size (i.e. we can not rely on it for 
correctness,
+ but it can be used for choosing proper code sequence for a given size).
  
  The use for multiple @code{setmem@var{m}} is as for @code{movmem@var{m}}.
  
Index: optabs.c
===
*** optabs.c(revision 204945)
--- optabs.c(working copy)
*** maybe_gen_insn (enum insn_code icode, un
*** 8229,8234 
--- 8229,8238 
return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
  ops[3].value, ops[4].value, ops[5].value,
  ops[6].value, ops[7].value);
+ case 9:
+   return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
+ ops[3].value, ops[4].value, ops[5].value,
+ ops[6].value, ops[7].value, ops[8].value);
  }
gcc_unreachable ();
  }
Index: builtins.c
===
*** builtins.c  (revision 204945)
--- builtins.c  (working copy)
*** builtin_memcpy_read_str (void *data, HOS
*** 3096,3107 
  }
  
  /* LEN specify length of the block of memcpy/memset operation.
!Figure out its range and put it into MIN_SIZE/MAX_SIZE.  */
  
  static void
  determine_block_size (tree len, rtx len_rtx,
  unsigned HOST_WIDE_INT *min_size,
! unsigned HOST_WIDE_INT *max_size)
  {
if (CONST_INT_P (len_rtx))
  {
--- 3096,3110 
  }
  
  /* LEN specify length of the block of memcpy/memset operation.
!Figure out its range and put it into MIN_SIZE/MAX_SIZE. 
!In some cases we can make very likely guess on max size, then we
!set it into PROBABLE_MAX_SIZE.  */
  
  static void
  determine_block_size (tree len, rtx len_rtx,
  unsigned HOST_WIDE_INT *min_size,
! unsigned HOST_WIDE_INT *max_size,
! unsigned HOST_WIDE_INT *probable_max_size)
  {
if (CONST_INT_P (len_rtx))
  {
*** determine_block_size (tree len, rtx len_
*** 3111,3138 
else
  {
double_int min, max;
!   if (TREE_CODE (len) == SSA_NAME 
! && get_range_info (len, &min, &max) == VR_RANGE)
{
! if (min.fits_uhwi ())
*min_size = min.to_uhwi ();
! else
!   *min_size = 0;
! if (max.fits_uhwi ())
!   *max_size = max.to_uhwi ();
! else
!   *max_size = (HOST_WIDE_INT)-1;
}
!   else
{
! if (host_integerp (TYPE_MIN_VALUE (TREE_TYPE (len)), 1))
!   *min_size = tree_low_cst (TYPE_MIN_VALUE (TREE_TYPE (len)), 1);
! else
!   *min_size = 0;
! if (host_integerp (TYPE_MAX_VALUE (TREE_TYPE (len)), 1))
!   *max_size = tree_low_cst (TYPE_MAX_VALUE (TREE_TYPE (len)), 1);
! else
!   *max_size = GET_MODE_MASK (GET_MODE (len_rtx));
}
  }
gcc_checking_assert (*max_size <=
--- 3114,3160 
else
  {
double_int min, max;
!   enum value_range_type range_type = VR_UNDEFINED;
! 
!   /* Determine bounds from 

Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread Arnaud Charlet
> Ada was using long for time_t and
> 
> type timeval is array (1 .. 2) of C.long
> 
> It assumes that the type of tv_nsec is the same as tv_sec.

Yes, and that was indeed wrong/dangerous.

> > --- s-osinte-solaris-posix.ads  (revision 298928)
> > +++ s-osinte-solaris-posix.ads  (working copy)
> > @@ -513,7 +513,7 @@
> >
> > type timespec is record
> >tv_sec  : time_t;
> > -  tv_nsec : long;
> > +  tv_nsec : time_t;
> > end record;
> > pragma Convention (C, timespec);
> >
> > But I'd rather settle on which type is appropriate for the tv_nsec
> > field before making this change.
> >
> 
> __syscall_slong_t is a Linux specific type.  We
> can add tv_nsec_t, which should be the same
> as time_t for all the current targets.

Introducing tv_nsec_t looks reasonable to me.

Arno


Re: Add value range support into memcpy/memset expansion

2013-11-18 Thread H.J. Lu
On Sun, Nov 17, 2013 at 3:38 PM, Jan Hubicka  wrote:
> Hi,
> this is version I comitted. It also adds a testcase and enables the support 
> in i386 backend.
>
> Honza
>
> * doc/md.texi (setmem, movstr): Update documentation.
> * builtins.c (determine_block_size): New function.
> (expand_builtin_memcpy): Use it and pass it to
> emit_block_move_hints.
> (expand_builtin_memset_args): Use it and pass it to
> set_storage_via_setmem.
> * expr.c (emit_block_move_via_movmem): Add min_size/max_size 
> parameters;
> update call to expander.
> (emit_block_move_hints): Add min_size/max_size parameters.
> (clear_storage_hints): Likewise.
> (set_storage_via_setmem): Likewise.
> (clear_storage): Update.
> * expr.h (emit_block_move_hints, clear_storage_hints,
> set_storage_via_setmem): Update prototype.
> * i386.c (ix86_expand_set_or_movmem): Add bounds; export.
> (ix86_expand_movmem, ix86_expand_setmem): Remove.
> (ix86_expand_movmem, ix86_expand_setmem): Remove.
> * i386.md (movmem, setmem): Pass parameters.
>
> * testsuite/gcc.target/i386/memcpy-2.c: New testcase.

The new testcase fails for me on x86 and x86-64:

FAIL: gcc.target/i386/memcpy-2.c scan-assembler-not memcpy
FAIL: gcc.target/i386/memcpy-2.c (test for excess errors)

-- 
H.J.


Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread H.J. Lu
On Mon, Nov 18, 2013 at 6:17 AM, Arnaud Charlet  wrote:

>> >
>> > type timespec is record
>> >tv_sec  : time_t;
>> > -  tv_nsec : long;
>> > +  tv_nsec : time_t;
>> > end record;
>> > pragma Convention (C, timespec);
>> >
>> > But I'd rather settle on which type is appropriate for the tv_nsec
>> > field before making this change.
>> >
>>
>> __syscall_slong_t is a Linux specific type.  We
>> can add tv_nsec_t, which should be the same
>> as time_t for all the current targets.
>
> Introducing tv_nsec_t looks reasonable to me.
>

Can you make the change?

Thanks.


-- 
H.J.


Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-18 Thread Arnaud Charlet
> >> __syscall_slong_t is a Linux specific type.  We
> >> can add tv_nsec_t, which should be the same
> >> as time_t for all the current targets.
> >
> > Introducing tv_nsec_t looks reasonable to me.
> >
> 
> Can you make the change?
> 
> Thanks.

Not right now, I have lots of other things to do.

Arno


Re: libsanitizer merge from upstream r191666

2013-11-18 Thread Jakub Jelinek
On Fri, Nov 15, 2013 at 11:10:18AM +0100, Jakub Jelinek wrote:
> 2013-11-15  Jakub Jelinek  
> 
>   * cfgexpand.c (struct stack_vars_data): Add asan_base and asan_alignb
>   fields.
>   (expand_stack_vars): For -fsanitize=address, use (and set initially)
>   data->asan_base as base for vars and update asan_alignb.
>   (expand_used_vars): Initialize data.asan_base and data.asan_alignb.
>   Pass them to asan_emit_stack_protection.
>   * asan.c (asan_detect_stack_use_after_return): New variable.
>   (asan_emit_stack_protection): Add pbase and alignb arguments.
>   Implement use after return sanitization.
>   * asan.h (asan_emit_stack_protection): Adjust prototype.
>   (ASAN_STACK_MAGIC_USE_AFTER_RET, ASAN_STACK_RETIRED_MAGIC): Define.

Here is an updated version of that, that applies cleanly against Honza's
latest committed changes (but there are further changes pending AFAIK, so
likely yet another ", sz" needed) and increments also the size of the
bottom (aka left) red zone if alignment 64 bytes or higher is needed, so
that we don't (temporarily??) avoid wasting space and runtime.

2013-11-18  Jakub Jelinek  

* cfgexpand.c (struct stack_vars_data): Add asan_base and asan_alignb
fields.
(expand_stack_vars): For -fsanitize=address, use (and set initially)
data->asan_base as base for vars and update asan_alignb.
(expand_used_vars): Initialize data.asan_base and data.asan_alignb.
Pass them to asan_emit_stack_protection.
* asan.c (asan_detect_stack_use_after_return): New variable.
(asan_emit_stack_protection): Add pbase and alignb arguments.
Implement use after return sanitization.
* asan.h (asan_emit_stack_protection): Adjust prototype.
(ASAN_STACK_MAGIC_USE_AFTER_RET, ASAN_STACK_RETIRED_MAGIC): Define.

--- gcc/cfgexpand.c.jj  2013-11-18 09:59:02.772253297 +0100
+++ gcc/cfgexpand.c 2013-11-18 15:28:48.770478076 +0100
@@ -879,6 +879,12 @@ struct stack_vars_data
 
   /* Vector of partition representative decls in between the paddings.  */
   vec asan_decl_vec;
+
+  /* Base pseudo register for Address Sanitizer protected automatic vars.  */
+  rtx asan_base;
+
+  /* Alignment needed for the Address Sanitizer protected automatic vars.  */
+  unsigned int asan_alignb;
 };
 
 /* A subroutine of expand_used_vars.  Give each partition representative
@@ -963,6 +969,7 @@ expand_stack_vars (bool (*pred) (size_t)
   alignb = stack_vars[i].alignb;
   if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
{
+ base = virtual_stack_vars_rtx;
  if ((flag_sanitize & SANITIZE_ADDRESS) && pred)
{
  HOST_WIDE_INT prev_offset = frame_offset;
@@ -991,10 +998,13 @@ expand_stack_vars (bool (*pred) (size_t)
  if (repr_decl == NULL_TREE)
repr_decl = stack_vars[i].decl;
  data->asan_decl_vec.safe_push (repr_decl);
+ data->asan_alignb = MAX (data->asan_alignb, alignb);
+ if (data->asan_base == NULL)
+   data->asan_base = gen_reg_rtx (Pmode);
+ base = data->asan_base;
}
  else
offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
- base = virtual_stack_vars_rtx;
  base_align = crtl->max_used_stack_slot_alignment;
}
   else
@@ -1781,6 +1791,8 @@ expand_used_vars (void)
 
   data.asan_vec = vNULL;
   data.asan_decl_vec = vNULL;
+  data.asan_base = NULL_RTX;
+  data.asan_alignb = 0;
 
   /* Reorder decls to be protected by iterating over the variables
 array multiple times, and allocating out of each phase in turn.  */
@@ -1805,16 +1817,29 @@ expand_used_vars (void)
   if (!data.asan_vec.is_empty ())
{
  HOST_WIDE_INT prev_offset = frame_offset;
- HOST_WIDE_INT offset
-   = alloc_stack_frame_space (ASAN_RED_ZONE_SIZE,
-  ASAN_RED_ZONE_SIZE);
+ HOST_WIDE_INT offset, sz;
+ sz = ASAN_RED_ZONE_SIZE;
+ sz = data.asan_vec[0] - prev_offset;
+ if (data.asan_alignb > ASAN_RED_ZONE_SIZE
+ && data.asan_alignb <= 4096
+ && sz + ASAN_RED_ZONE_SIZE >= data.asan_alignb)
+   {
+ sz = ((sz + ASAN_RED_ZONE_SIZE + data.asan_alignb - 1)
+   & ~(data.asan_alignb - HOST_WIDE_INT_1)) - sz;
+   }
+ else
+   sz = ASAN_RED_ZONE_SIZE;
+ offset
+   = alloc_stack_frame_space (sz, ASAN_RED_ZONE_SIZE);
  data.asan_vec.safe_push (prev_offset);
  data.asan_vec.safe_push (offset);
 
  var_end_seq
= asan_emit_stack_protection (virtual_stack_vars_rtx,
+ data.asan_base,
+ data.asan_alignb,
  data.asan_vec.address (),
-

Re: [PATCH] Implement -fsanitize=null + new sanopt pass

2013-11-18 Thread Marek Polacek
On Mon, Nov 18, 2013 at 02:51:41PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 13, 2013 at 12:13:48AM +0100, Marek Polacek wrote:
> > --- gcc/config/bootstrap-ubsan.mk.mp2013-11-12 13:46:13.345182065 
> > +0100
> > +++ gcc/config/bootstrap-ubsan.mk   2013-11-12 13:46:49.812314016 +0100
> > @@ -2,6 +2,6 @@
> >  
> >  STAGE2_CFLAGS += -fsanitize=undefined
> >  STAGE3_CFLAGS += -fsanitize=undefined
> > -POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread \
> > +POSTSTAGE1_LDFLAGS += -fsanitize=undefined -static-libubsan -lpthread -ldl 
> > \
> 
> Hopefully with my pending patch you can remove the -lpthread -ldl again, but
> ok for now.

Cool.

> > +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > +   {
> > + gimple stmt = gsi_stmt (gsi);
> > +
> > + if (gimple_code (stmt) != GIMPLE_CALL)
> 
> if (is_gimple_call (stmt))

Fixed.
 
> Ok with those changes.

Thanks.  Also I'll have to add some headers after the gimple.h reorg,
but that is an obvious change.

Marek


Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-18 Thread Kyrill Tkachov

On 16/11/13 01:20, Aldy Hernandez wrote:

On 11/15/13 15:05, Aldy Hernandez wrote:

Why all this, and not keep everything but
GF_OMP_FOR_KIND_SIMD and GF_OMP_FOR_KIND_DISTRIBUTE
as they were, and just use:
 GF_OMP_FOR_KIND_DISTIRBUTE = 1 << 0,
 GF_OMP_FOR_KIND_SIMD   = 2 << 0,
 GF_OMP_FOR_KIND_CILKSIMD   = 3 << 0,

Sounds good.  Testing the following patch against my previous patch
which had already been committed.

If there are no regressions, I will commit.

Aldy


Committed.


Hi Aldy,

The testcase c-c++-common/cilk-plus/PS/body.c needs an effective target check 
for fopenmp before adding -fopenmp to its options, otherwise it'll fail on 
bare-metal targets like arm-none-eabi.


This patch adds that.

Ok to commit?

Thanks,
Kyrill

2013-11-18  Kyrylo Tkachov  

* c-c++-common/cilk-plus/PS/body.c: Add fopenmp effective target check.diff --git a/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c b/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c
index 9b10041..82c0a0c 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/PS/body.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-fcilkplus -fopenmp" } */
+/* { dg-require-effective-target fopenmp } */
 
 int *a, *b, c;
 void *jmpbuf[10];

Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-18 Thread Sergey Ostanevich
I would agree that the example is just for the case cost model makes
correct estimation But how can we assure ourself that it won't have any
mistakes in the future?

I believe it'll be Ok to introduce an extra flag as Jakub proposed for the
dedicated simd-forced vectorization to use unlimited cost model. This
can be default for -fopenmp or there should be a warning issued that
compiler overrides user's request of vectorization. In such a case user
can enforce vectorization (even with mentioned results :) with this
unlimited cost model for simd.



On Fri, Nov 15, 2013 at 6:24 PM, Richard Biener  wrote:
> On Fri, 15 Nov 2013, Sergey Ostanevich wrote:
>
>> Richard,
>>
>> here's an example that causes trigger for the cost model.
>
> I hardly believe that (AVX2)
>
> .L9:
> vmovups (%rsi), %xmm3
> addl$1, %r8d
> addq$256, %rsi
> vinsertf128 $0x1, -240(%rsi), %ymm3, %ymm1
> vmovups -224(%rsi), %xmm3
> vinsertf128 $0x1, -208(%rsi), %ymm3, %ymm3
> vshufps $136, %ymm3, %ymm1, %ymm3
> vperm2f128  $3, %ymm3, %ymm3, %ymm2
> vshufps $68, %ymm2, %ymm3, %ymm1
> vshufps $238, %ymm2, %ymm3, %ymm2
> vmovups -192(%rsi), %xmm3
> vinsertf128 $1, %xmm2, %ymm1, %ymm2
> vinsertf128 $0x1, -176(%rsi), %ymm3, %ymm1
> vmovups -160(%rsi), %xmm3
> vinsertf128 $0x1, -144(%rsi), %ymm3, %ymm3
> vshufps $136, %ymm3, %ymm1, %ymm3
> vperm2f128  $3, %ymm3, %ymm3, %ymm1
> vshufps $68, %ymm1, %ymm3, %ymm4
> vshufps $238, %ymm1, %ymm3, %ymm1
> vmovups -128(%rsi), %xmm3
> vinsertf128 $1, %xmm1, %ymm4, %ymm1
> vshufps $136, %ymm1, %ymm2, %ymm1
> vperm2f128  $3, %ymm1, %ymm1, %ymm2
> vshufps $68, %ymm2, %ymm1, %ymm4
> vshufps $238, %ymm2, %ymm1, %ymm2
> vinsertf128 $0x1, -112(%rsi), %ymm3, %ymm1
> vmovups -96(%rsi), %xmm3
> vinsertf128 $1, %xmm2, %ymm4, %ymm4
> vinsertf128 $0x1, -80(%rsi), %ymm3, %ymm3
> vshufps $136, %ymm3, %ymm1, %ymm3
> vperm2f128  $3, %ymm3, %ymm3, %ymm2
> vshufps $68, %ymm2, %ymm3, %ymm1
> vshufps $238, %ymm2, %ymm3, %ymm2
> vmovups -64(%rsi), %xmm3
> vinsertf128 $1, %xmm2, %ymm1, %ymm2
> vinsertf128 $0x1, -48(%rsi), %ymm3, %ymm1
> vmovups -32(%rsi), %xmm3
> vinsertf128 $0x1, -16(%rsi), %ymm3, %ymm3
> cmpl%r8d, %edi
> vshufps $136, %ymm3, %ymm1, %ymm3
> vperm2f128  $3, %ymm3, %ymm3, %ymm1
> vshufps $68, %ymm1, %ymm3, %ymm5
> vshufps $238, %ymm1, %ymm3, %ymm1
> vinsertf128 $1, %xmm1, %ymm5, %ymm1
> vshufps $136, %ymm1, %ymm2, %ymm1
> vperm2f128  $3, %ymm1, %ymm1, %ymm2
> vshufps $68, %ymm2, %ymm1, %ymm3
> vshufps $238, %ymm2, %ymm1, %ymm2
> vinsertf128 $1, %xmm2, %ymm3, %ymm1
> vshufps $136, %ymm1, %ymm4, %ymm1
> vperm2f128  $3, %ymm1, %ymm1, %ymm2
> vshufps $68, %ymm2, %ymm1, %ymm3
> vshufps $238, %ymm2, %ymm1, %ymm2
> vinsertf128 $1, %xmm2, %ymm3, %ymm2
> vaddps  %ymm2, %ymm0, %ymm0
> ja  .L9
>
> is more efficient than
>
> .L3:
> vaddss  (%rcx,%rax), %xmm0, %xmm0
> addq$32, %rax
> cmpq%rdx, %rax
> jne .L3
>
> ;)
>
>> As soon as
>> elemental functions will appear and we update the vectorizer so it can accept
>> an elemental function inside the loop - we will have the same
>> situation as we have
>> it now: cost model will bail out with profitability estimation.
>
> Yes.
>
>> Still we have no chance to get info on how efficient the bar() function when 
>> it
>> is in vector form.
>
> Well I assume you mean that the speedup when vectorizing the elemental
> will offset whatever wreckage we cause with vectorizing the rest of the
> statements.  I'd say you can at least compare to unrolling by
> the vectorization factor, building the vector inputs to the elemental
> from scalars, distributing the vector result from the elemental to
> scalars.
>
>> I believe I should repeat: #pragma omp simd is intended for introduction of 
>> an
>> instruction-level parallel region on developer's request, hence should
>> be treated
>> in same manner as #pragma omp parallel. Vectorizer cost model is an obstacle
>> here, not a help.
>
> Surely not if there isn't an elemental call in it.  With it the
> cost model of course will have not enough information to decide.
>
> But still, what's the difference to the case where we cannot vectorize
> the function?  What happens if we cannot vectorize the elemental?
> Do we have to build scalar versions for all possible vector sizes?
>
> Richard.
>
>> Regards,
>> Sergos
>>
>>
>> On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener  wrote:
>> > Sergey Ostanevich  wrote:
>> >>this is only for the whole file? I mean to have a particula

Re: [PATCH] Fix lto bootstrap verification failure with -freorder-blocks-and-partition

2013-11-18 Thread Teresa Johnson
On Sat, Nov 16, 2013 at 12:33 AM, Jan Hubicka  wrote:
>> When testing with -freorder-blocks-and-partition enabled, I hit a
>> verification failure in an LTO profiledbootstrap. Edge forwarding
>> performed when we went into cfg layout mode after bb reordering
>> (during compgotos) created a situation where a hot block was then
>> dominated by a cold block and was therefore remarked as cold. Because
>> bb reorder was complete at that point, it was not moved in the
>> physical layout, and we incorrectly went in and out of the cold
>> section multiple times.
>>
>> The following patch addresses that by fixing the layout when we move
>> blocks to the cold section after bb reordering is complete.
>>
>> Tested with an LTO profiledbootstrap with
>> -freorder-blocks-and-partition enabled. Ok for trunk?
>>
>> Thanks,
>> Teresa
>>
>> 2013-11-15  Teresa Johnson  
>>
>> * cfgrtl.c (fixup_partitions): Reorder blocks if necessary.
>
> computed_gotos just unfactors unified blocks that we use to avoid CFGs with
> O(n^2) edges. This is mostly to avoid problems with nonlinearity of other 
> passes
> and to reduce the quadratic memory use case to one function at a time.
>
> I wonder if it won't be cleaner to simply unfactor those just before 
> pass_reorder_blocks.
>
> Computed gotos are used e.g. in libjava interpreter to optimize the tight 
> interpretting
> loop.  I think those cases would benefit from having at least 
> scheduling/reordering
> and alignments done right.
>
> Of course it depends on how bad the compile time implications are (I think in 
> addition
> to libjava, there was a lucier's testcase that made us to go for this trick) ,
> but I would prefer it over ading yet another hack into cfgrtl...
> We also may just avoid cfglayout cleanup_cfg while doing computed gotos...

I am testing a new patch right now that simply moves compgotos to just
before bb reordering, and ads an assert to cfg_layout_initialize to
detect when we attempt that after bb reordering. I looked at the other
places that go into cfg layout and I think compgotos is currently the
only one after bb reordering.

>From a bb layout perspective it seems like it would be beneficial to
do compgotos before layout. Was the current position just to try to
reduce compile time by keeping the block unified as long as possible?
For your comment "I think those cases would benefit from having at
least scheduling/reordering and alignments done right." do you mean
that it would be better from a code quality perspective for those to
have compgotos done earlier before those passes? That seems to make
sense to me as well.

I'm doing an lto profiledbootstrap with the change right now. Is there
other testing that should be done for this change? Can you point me at
lucier's testcase that you refer to above? I found that PR15242 was
the bug that inspired adding compgotos, but it is a small test case so
I'm not sure what I will learn from trying that other than whether
compgotos still kicks in as expected.

Thanks,
Teresa

>
> Honza



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: Use anti-ranges in memcpy/memset expansion

2013-11-18 Thread Richard Biener
On Mon, 18 Nov 2013, Jan Hubicka wrote:

> Hi,
> this patch adds support to anti-ranges to determine_block_size.
> This makes it possible to bound size of block from bellow that is
> useful to avoid need for small size code path and it also allows
> us to inline the following:
> void *a;
> void *b;
> t(int c)
> {
>   if (c<10)
> memcpy (a,b,c);
> }
> 
> Because C is signed, we do not really know that the block is smaller
> than 10, but it is most likely the case.
> 
> Bootstrapped/regtested x86_64-linux OK?

Ok.

Thanks,
Richard.

> Honza
> 
>   * md.texi (setmem): Document new parameter.
>   * optabs.c (maybe_gen_insn): Support 9 operands.
>   * builtins.c (determine_block_size): Add probable_max_size;
>   support anti-ranges.
>   (expand_builtin_memcpy. expand_builtin_memset_args): Pass around
>   probable_max_size.
>   * expr.c (emit_block_move_via_movmem, emit_block_move_hints,
>   emit_block_move, clear_storage_hints, set_storage_via_setmem):
>   Likewise.
>   * expr.h (emit_block_move_hints, clear_storage_hints,
>   set_storage_via_setmem): Update prototype.
>   * i386.md (setmem, movmem patterns): Add 9th operand.
>   * i386-protos.h (ix86_expand_set_or_movmem): Update prototype.
>   * i386.c (ix86_expand_set_or_movmem): Take probable_max_size_exp
>   argument; pass it to decide_alg.
> 
>   * gcc.target/i386/memcpy-3.c: New testcase.
> Index: doc/md.texi
> ===
> *** doc/md.texi   (revision 204945)
> --- doc/md.texi   (working copy)
> *** all cases. This expected alignment is al
> *** 5352,5357 
> --- 5352,5359 
>   Expected size, when unknown, is set to @code{(const_int -1)}.
>   Operand 7 is the minimal size of the block and operand 8 is the
>   maximal size of the block (NULL if it can not be represented as CONST_INT).
> + Operand 9 is the probable maximal size (i.e. we can not rely on it for 
> correctness,
> + but it can be used for choosing proper code sequence for a given size).
>   
>   The use for multiple @code{setmem@var{m}} is as for @code{movmem@var{m}}.
>   
> Index: optabs.c
> ===
> *** optabs.c  (revision 204945)
> --- optabs.c  (working copy)
> *** maybe_gen_insn (enum insn_code icode, un
> *** 8229,8234 
> --- 8229,8238 
> return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
> ops[3].value, ops[4].value, ops[5].value,
> ops[6].value, ops[7].value);
> + case 9:
> +   return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
> +   ops[3].value, ops[4].value, ops[5].value,
> +   ops[6].value, ops[7].value, ops[8].value);
>   }
> gcc_unreachable ();
>   }
> Index: builtins.c
> ===
> *** builtins.c(revision 204945)
> --- builtins.c(working copy)
> *** builtin_memcpy_read_str (void *data, HOS
> *** 3096,3107 
>   }
>   
>   /* LEN specify length of the block of memcpy/memset operation.
> !Figure out its range and put it into MIN_SIZE/MAX_SIZE.  */
>   
>   static void
>   determine_block_size (tree len, rtx len_rtx,
> unsigned HOST_WIDE_INT *min_size,
> !   unsigned HOST_WIDE_INT *max_size)
>   {
> if (CONST_INT_P (len_rtx))
>   {
> --- 3096,3110 
>   }
>   
>   /* LEN specify length of the block of memcpy/memset operation.
> !Figure out its range and put it into MIN_SIZE/MAX_SIZE. 
> !In some cases we can make very likely guess on max size, then we
> !set it into PROBABLE_MAX_SIZE.  */
>   
>   static void
>   determine_block_size (tree len, rtx len_rtx,
> unsigned HOST_WIDE_INT *min_size,
> !   unsigned HOST_WIDE_INT *max_size,
> !   unsigned HOST_WIDE_INT *probable_max_size)
>   {
> if (CONST_INT_P (len_rtx))
>   {
> *** determine_block_size (tree len, rtx len_
> *** 3111,3138 
> else
>   {
> double_int min, max;
> !   if (TREE_CODE (len) == SSA_NAME 
> !   && get_range_info (len, &min, &max) == VR_RANGE)
>   {
> !   if (min.fits_uhwi ())
>   *min_size = min.to_uhwi ();
> !   else
> ! *min_size = 0;
> !   if (max.fits_uhwi ())
> ! *max_size = max.to_uhwi ();
> !   else
> ! *max_size = (HOST_WIDE_INT)-1;
>   }
> !   else
>   {
> !   if (host_integerp (TYPE_MIN_VALUE (TREE_TYPE (len)), 1))
> ! *min_size = tree_low_cst (TYPE_MIN_VALUE (TREE_TYPE (len)), 1);
> !   else
> ! *min_size = 0;
> !   if (host_integerp (TYPE_MAX_VALUE (TREE_TYPE (len)), 1))
> ! *max_size = tree_low_cst (TYPE_MAX_VALUE (TREE_TYPE (len)), 1);
> !   else
> ! *max_siz

Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-18 Thread Richard Biener
On Mon, 18 Nov 2013, Sergey Ostanevich wrote:

> I would agree that the example is just for the case cost model makes
> correct estimation But how can we assure ourself that it won't have any
> mistakes in the future?

We call it bugs and not mistakes and we have bugzilla for it.

Richard.

> I believe it'll be Ok to introduce an extra flag as Jakub proposed for the
> dedicated simd-forced vectorization to use unlimited cost model. This
> can be default for -fopenmp or there should be a warning issued that
> compiler overrides user's request of vectorization. In such a case user
> can enforce vectorization (even with mentioned results :) with this
> unlimited cost model for simd.
> 
> 
> 
> On Fri, Nov 15, 2013 at 6:24 PM, Richard Biener  wrote:
> > On Fri, 15 Nov 2013, Sergey Ostanevich wrote:
> >
> >> Richard,
> >>
> >> here's an example that causes trigger for the cost model.
> >
> > I hardly believe that (AVX2)
> >
> > .L9:
> > vmovups (%rsi), %xmm3
> > addl$1, %r8d
> > addq$256, %rsi
> > vinsertf128 $0x1, -240(%rsi), %ymm3, %ymm1
> > vmovups -224(%rsi), %xmm3
> > vinsertf128 $0x1, -208(%rsi), %ymm3, %ymm3
> > vshufps $136, %ymm3, %ymm1, %ymm3
> > vperm2f128  $3, %ymm3, %ymm3, %ymm2
> > vshufps $68, %ymm2, %ymm3, %ymm1
> > vshufps $238, %ymm2, %ymm3, %ymm2
> > vmovups -192(%rsi), %xmm3
> > vinsertf128 $1, %xmm2, %ymm1, %ymm2
> > vinsertf128 $0x1, -176(%rsi), %ymm3, %ymm1
> > vmovups -160(%rsi), %xmm3
> > vinsertf128 $0x1, -144(%rsi), %ymm3, %ymm3
> > vshufps $136, %ymm3, %ymm1, %ymm3
> > vperm2f128  $3, %ymm3, %ymm3, %ymm1
> > vshufps $68, %ymm1, %ymm3, %ymm4
> > vshufps $238, %ymm1, %ymm3, %ymm1
> > vmovups -128(%rsi), %xmm3
> > vinsertf128 $1, %xmm1, %ymm4, %ymm1
> > vshufps $136, %ymm1, %ymm2, %ymm1
> > vperm2f128  $3, %ymm1, %ymm1, %ymm2
> > vshufps $68, %ymm2, %ymm1, %ymm4
> > vshufps $238, %ymm2, %ymm1, %ymm2
> > vinsertf128 $0x1, -112(%rsi), %ymm3, %ymm1
> > vmovups -96(%rsi), %xmm3
> > vinsertf128 $1, %xmm2, %ymm4, %ymm4
> > vinsertf128 $0x1, -80(%rsi), %ymm3, %ymm3
> > vshufps $136, %ymm3, %ymm1, %ymm3
> > vperm2f128  $3, %ymm3, %ymm3, %ymm2
> > vshufps $68, %ymm2, %ymm3, %ymm1
> > vshufps $238, %ymm2, %ymm3, %ymm2
> > vmovups -64(%rsi), %xmm3
> > vinsertf128 $1, %xmm2, %ymm1, %ymm2
> > vinsertf128 $0x1, -48(%rsi), %ymm3, %ymm1
> > vmovups -32(%rsi), %xmm3
> > vinsertf128 $0x1, -16(%rsi), %ymm3, %ymm3
> > cmpl%r8d, %edi
> > vshufps $136, %ymm3, %ymm1, %ymm3
> > vperm2f128  $3, %ymm3, %ymm3, %ymm1
> > vshufps $68, %ymm1, %ymm3, %ymm5
> > vshufps $238, %ymm1, %ymm3, %ymm1
> > vinsertf128 $1, %xmm1, %ymm5, %ymm1
> > vshufps $136, %ymm1, %ymm2, %ymm1
> > vperm2f128  $3, %ymm1, %ymm1, %ymm2
> > vshufps $68, %ymm2, %ymm1, %ymm3
> > vshufps $238, %ymm2, %ymm1, %ymm2
> > vinsertf128 $1, %xmm2, %ymm3, %ymm1
> > vshufps $136, %ymm1, %ymm4, %ymm1
> > vperm2f128  $3, %ymm1, %ymm1, %ymm2
> > vshufps $68, %ymm2, %ymm1, %ymm3
> > vshufps $238, %ymm2, %ymm1, %ymm2
> > vinsertf128 $1, %xmm2, %ymm3, %ymm2
> > vaddps  %ymm2, %ymm0, %ymm0
> > ja  .L9
> >
> > is more efficient than
> >
> > .L3:
> > vaddss  (%rcx,%rax), %xmm0, %xmm0
> > addq$32, %rax
> > cmpq%rdx, %rax
> > jne .L3
> >
> > ;)
> >
> >> As soon as
> >> elemental functions will appear and we update the vectorizer so it can 
> >> accept
> >> an elemental function inside the loop - we will have the same
> >> situation as we have
> >> it now: cost model will bail out with profitability estimation.
> >
> > Yes.
> >
> >> Still we have no chance to get info on how efficient the bar() function 
> >> when it
> >> is in vector form.
> >
> > Well I assume you mean that the speedup when vectorizing the elemental
> > will offset whatever wreckage we cause with vectorizing the rest of the
> > statements.  I'd say you can at least compare to unrolling by
> > the vectorization factor, building the vector inputs to the elemental
> > from scalars, distributing the vector result from the elemental to
> > scalars.
> >
> >> I believe I should repeat: #pragma omp simd is intended for introduction 
> >> of an
> >> instruction-level parallel region on developer's request, hence should
> >> be treated
> >> in same manner as #pragma omp parallel. Vectorizer cost model is an 
> >> obstacle
> >> here, not a help.
> >
> > Surely not if there isn't an elemental call in it.  With it the
> > cost model of course will have not enough information to decide.
> >
> > But still, what's the difference to

Minor cgraph cleanups

2013-11-18 Thread Bernd Schmidt
This fixes some minor ugliness I noticed while looking at this code. It
renames a poorly named global variable (shadowed by some function args
in the same file), replaces a cast of an integer value to a pointer, and
removes an undocumented and unused return value.

Bootstrapped and tested on x86_64-linux, ok?


Bernd
	* cgraphunit.c (symtab_terminator): New variable.
	(queued_nodes): Renamed from first.  Use symtab_terminator as
	initializer.
	(analyze_functions): Adjust accordingly.
	(cgraph_process_new_functions): Return void.
	* cgraph.h (cgraph_process_new_functions): Adjust declaration.

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 1ac6dfb..0d59a7a 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -737,7 +737,7 @@ void cgraph_finalize_function (tree, bool);
 void finalize_compilation_unit (void);
 void compile (void);
 void init_cgraph (void);
-bool cgraph_process_new_functions (void);
+void cgraph_process_new_functions (void);
 void cgraph_process_same_body_aliases (void);
 void fixup_same_cpp_alias_visibility (symtab_node *, symtab_node *target, tree);
 /*  Initialize datastructures so DECL is a function in lowered gimple form.
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 8ab274b..50bc7d3 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -264,11 +264,13 @@ decide_is_symbol_needed (symtab_node *node)
   return false;
 }
 
-/* Head of the queue of nodes to be processed while building callgraph */
+/* Head and terminator of the queue of nodes to be processed while building
+   callgraph.  */
 
-static symtab_node *first = (symtab_node *)(void *)1;
+static symtab_node symtab_terminator;
+static symtab_node *queued_nodes = &symtab_terminator;
 
-/* Add NODE to queue starting at FIRST. 
+/* Add NODE to queue starting at QUEUED_NODES. 
The queue is linked via AUX pointers and terminated by pointer to 1.  */
 
 static void
@@ -276,25 +278,24 @@ enqueue_node (symtab_node *node)
 {
   if (node->aux)
 return;
-  gcc_checking_assert (first);
-  node->aux = first;
-  first = node;
+  gcc_checking_assert (queued_nodes);
+  node->aux = queued_nodes;
+  queued_nodes = node;
 }
 
 /* Process CGRAPH_NEW_FUNCTIONS and perform actions necessary to add these
functions into callgraph in a way so they look like ordinary reachable
functions inserted into callgraph already at construction time.  */
 
-bool
+void
 cgraph_process_new_functions (void)
 {
-  bool output = false;
   tree fndecl;
   struct cgraph_node *node;
   cgraph_node_set_iterator csi;
 
   if (!cgraph_new_nodes)
-return false;
+return;
   handle_alias_pairs ();
   /*  Note that this queue may grow as its being processed, as the new
   functions may generate new ones.  */
@@ -309,7 +310,6 @@ cgraph_process_new_functions (void)
 	 it into reachable functions list.  */
 
 	  cgraph_finalize_function (fndecl, false);
-	  output = true;
   cgraph_call_function_insertion_hooks (node);
 	  enqueue_node (node);
 	  break;
@@ -350,7 +350,6 @@ cgraph_process_new_functions (void)
 }
   free_cgraph_node_set (cgraph_new_nodes);
   cgraph_new_nodes = NULL;
-  return output;
 }
 
 /* As an GCC extension we allow redefinition of the function.  The
@@ -980,11 +979,11 @@ analyze_functions (void)
 
   /* Lower representation, build callgraph edges and references for all trivially
  needed symbols and all symbols referred by them.  */
-  while (first != (symtab_node *)(void *)1)
+  while (queued_nodes != &symtab_terminator)
 	{
 	  changed = true;
-	  node = first;
-	  first = (symtab_node *)first->aux;
+	  node = queued_nodes;
+	  queued_nodes = (symtab_node *)queued_nodes->aux;
 	  cgraph_node *cnode = dyn_cast  (node);
 	  if (cnode && cnode->definition)
 	{


[C++ testcase, committed] PR 53473

2013-11-18 Thread Paolo Carlini

Hi,

tested x86_64-linux, committed to mainline.

Paolo.

/
2013-11-18  Paolo Carlini  

PR c++/53473
* g++.dg/cpp0x/constexpr-noexcept7.C: New.
Index: g++.dg/cpp0x/constexpr-noexcept7.C
===
--- g++.dg/cpp0x/constexpr-noexcept7.C  (revision 0)
+++ g++.dg/cpp0x/constexpr-noexcept7.C  (working copy)
@@ -0,0 +1,9 @@
+// PR c++/53473
+// { dg-do compile { target c++11 } }
+
+template struct A
+{
+  static constexpr T foo() noexcept { return 0; }
+};
+
+template<> constexpr int A::foo() noexcept { return 0; }


Re: [PATCH] Fix lto bootstrap verification failure with -freorder-blocks-and-partition

2013-11-18 Thread Steven Bosscher
On Mon, Nov 18, 2013 at 3:53 PM, Teresa Johnson wrote:
> From a bb layout perspective it seems like it would be beneficial to
> do compgotos before layout. Was the current position just to try to
> reduce compile time by keeping the block unified as long as possible?

It was more a hack that got out of hand. Apparently it hurt some
interpreters (branch prediction!) when the unified computed goto is
not "unfactored". There was a PR for this, and the unfactoring code I
added only fixed part of the problem.


> For your comment "I think those cases would benefit from having at
> least scheduling/reordering and alignments done right." do you mean
> that it would be better from a code quality perspective for those to
> have compgotos done earlier before those passes? That seems to make
> sense to me as well.

Theoretically: Yes, perhaps. In practice there isn't much to gain.
Unfactoring before bb-reorder is probably the most helpful thing, to
get better results for basic block alignment and placement. But
scheduling punts on computed gotos (or explodes in some non-linear
bottleneck).

What used to help is profile-based branch speculation, i.e.

if (*target_addr == most_frequent_target_addr)
  goto most_frequent_target_add;
else
  goto *target_addr;

But I'm not sure if our value profiling based optimizations still have
this case.


> I'm doing an lto profiledbootstrap with the change right now. Is there
> other testing that should be done for this change? Can you point me at
> lucier's testcase that you refer to above? I found that PR15242 was
> the bug that inspired adding compgotos, but it is a small test case so
> I'm not sure what I will learn from trying that other than whether
> compgotos still kicks in as expected.

ISTR it's http://gcc.gnu.org/PR26854

Ciao!
Steven


[PATCH, i386, MPX, 2/X] Pointers Checker [23/25] Target hooks

2013-11-18 Thread Ilya Enkovich
Hi,

Here is a patch with i386 version of Pointer Bounds Checker hooks.

Thanks,
Ilya
--
2013-11-15  Ilya Enkovich  

* config/i386/i386.c (ix86_builtin_mpx_function): New.
(ix86_load_bounds): New.
(ix86_store_bounds): New.
(ix86_fn_abi_va_list_bounds_size): New.
(ix86_mpx_bound_mode): New.
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6ddd37a..50355f1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -34044,6 +34044,162 @@ addcarryx:
   gcc_unreachable ();
 }
 
+/* Return function decl for target specific builtin
+   for given MPX builtin passed i FCODE.  */
+static tree
+ix86_builtin_mpx_function (unsigned fcode)
+{
+  switch (fcode)
+{
+case BUILT_IN_CHKP_BNDMK:
+  return ix86_builtins[IX86_BUILTIN_BNDMK];
+
+case BUILT_IN_CHKP_BNDSTX:
+  return ix86_builtins[IX86_BUILTIN_BNDSTX];
+
+case BUILT_IN_CHKP_BNDLDX:
+  return ix86_builtins[IX86_BUILTIN_BNDLDX];
+
+case BUILT_IN_CHKP_BNDCL:
+  return ix86_builtins[IX86_BUILTIN_BNDCL];
+
+case BUILT_IN_CHKP_BNDCU:
+  return ix86_builtins[IX86_BUILTIN_BNDCU];
+
+case BUILT_IN_CHKP_BNDRET:
+  return ix86_builtins[IX86_BUILTIN_BNDRET];
+
+case BUILT_IN_CHKP_INTERSECT:
+  return ix86_builtins[IX86_BUILTIN_BNDINT];
+
+case BUILT_IN_CHKP_SET_PTR_BOUNDS:
+  return ix86_builtins[IX86_BUILTIN_BNDSET];
+
+case BUILT_IN_CHKP_NARROW:
+  return ix86_builtins[IX86_BUILTIN_BNDNARROW];
+
+case BUILT_IN_CHKP_ARG_BND:
+  return ix86_builtins[IX86_BUILTIN_ARG_BND];
+
+case BUILT_IN_CHKP_SIZEOF:
+  return ix86_builtins[IX86_BUILTIN_SIZEOF];
+
+case BUILT_IN_CHKP_EXTRACT_LOWER:
+  return ix86_builtins[IX86_BUILTIN_BNDLOWER];
+
+case BUILT_IN_CHKP_EXTRACT_UPPER:
+  return ix86_builtins[IX86_BUILTIN_BNDUPPER];
+
+default:
+  return NULL_TREE;
+}
+
+  gcc_unreachable ();
+}
+
+/* Load bounds PTR pointer value loaded from SLOT.
+   if SLOT is a register then load bounds associated
+   with special address identified by BND.
+
+   Return loaded bounds.  */
+static rtx
+ix86_load_bounds (rtx slot, rtx ptr, rtx bnd)
+{
+  rtx addr = NULL;
+  rtx reg;
+
+  if (REG_P (slot))
+{
+  ptr = slot;
+
+  /* We do not expect non register bounds for register
+parameters other than R8 and R9.  */
+  gcc_assert (REGNO (ptr) == R8_REG || REGNO (ptr) == R9_REG);
+  gcc_assert (bnd == const1_rtx || bnd == const2_rtx);
+
+  /* Here we have the case when more than five pointers are
+passed on registers.  In this case we are out of bound
+registers and have to use bndldx to load bound.  RA and
+RA - 8 are used for address translation in bndldx.  */
+  if (bnd == const1_rtx)
+   addr = plus_constant (Pmode, arg_pointer_rtx, -8);
+  else
+   addr = plus_constant (Pmode, arg_pointer_rtx, -16);
+}
+  else if (MEM_P (slot))
+{
+  if (!ptr)
+   ptr = copy_to_mode_reg (Pmode, slot);
+  addr = XEXP (slot, 0);
+  addr = force_reg (Pmode, addr);
+}
+  else
+gcc_unreachable ();
+
+  ptr = force_reg (Pmode, ptr);
+  /* If ptr was a register originally then it may have
+ mode other than Pmode.  We need to extend in such
+ case because bndldx may work only with Pmode regs.  */
+  if (GET_MODE (ptr) != Pmode)
+{
+  rtx ext = gen_rtx_ZERO_EXTEND (Pmode, ptr);
+  ptr = gen_reg_rtx (Pmode);
+  emit_move_insn (ptr, ext);
+}
+
+  reg = gen_reg_rtx (BNDmode);
+  emit_insn (TARGET_64BIT
+? gen_bnd64_ldx (reg, addr, ptr)
+: gen_bnd32_ldx (reg, addr, ptr));
+
+  return reg;
+}
+
+/* Store bounds BOUNDS for PTR pointer value stored in
+   specified ADDR.  If ADDR is a register then TO identifies
+   which special address to use for bounds store.  */
+static void
+ix86_store_bounds (rtx ptr, rtx addr, rtx bounds, rtx to)
+{
+  if (REG_P (addr))
+{
+  /* Non register bounds comes only for parameters in
+R8 and R9.  */
+  gcc_assert (REGNO (addr) == R8_REG || REGNO (addr) == R9_REG);
+  gcc_assert (to == const1_rtx || to == const2_rtx);
+
+  if (to == const1_rtx)
+   addr = plus_constant (Pmode, stack_pointer_rtx, -8);
+  else
+   addr = plus_constant (Pmode, stack_pointer_rtx, -16);
+}
+  else if (MEM_P (addr))
+addr = XEXP (addr, 0);
+  else
+gcc_unreachable ();
+
+  /* Should we also ignore integer modes of incorrect size?.  */
+  ptr = force_reg (Pmode, ptr);
+  addr = force_reg (Pmode, addr);
+
+  /* Avoid registers which connot be used as index.  */
+  if (REGNO (ptr) == VIRTUAL_INCOMING_ARGS_REGNUM
+  || REGNO (ptr) == VIRTUAL_STACK_VARS_REGNUM
+  || REGNO (ptr) == VIRTUAL_OUTGOING_ARGS_REGNUM)
+{

Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-18 Thread Aldy Hernandez



The testcase c-c++-common/cilk-plus/PS/body.c needs an effective target
check for fopenmp before adding -fopenmp to its options, otherwise it'll
fail on bare-metal targets like arm-none-eabi.

This patch adds that.

Ok to commit?

Thanks,
Kyrill

2013-11-18  Kyrylo Tkachov  

 * c-c++-common/cilk-plus/PS/body.c: Add fopenmp effective target check.


You can commit this as obvious.

Thanks.


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-18 Thread Alexey Samsonov
(resending in plain text)

On Mon, Nov 18, 2013 at 7:49 PM, Alexey Samsonov  wrote:
 Hi Jakub,

 Unfortunately, recently there were a few enhancements to
 sanitizer_symbolizer_posix_libcdep.cc (and friends),
 in LLVM trunk, and now it looks different from gcc version (apparently, the
 changes were committed
 after the merge to gcc happened, I should have pointed this out earlier,
 sorry).

 Kostya (or Jakub), is it possible to somehow pick up the changes? Otherwise
 this patch can't go in ASan runtime
 in gcc - the code will significantly diverge.

> On Mon, Nov 18, 2013 at 5:39 PM, Jakub Jelinek  wrote:
>>
>> Hi!
>>
>> This patch adds libbacktrace symbolizer to libsanitizer, with which we
>> can avoid spawning and running external an external program (especially
>> when
>> it is not even provided by gcc itself).
>>
>> I've kept the possibility to override the symbolizer by magic symbols
>> (InternalSymbolizer), and as I have no idea how the llvm buildsystem etc.
>> works and what is the possibility there to add libbacktrace, this just
>> requires users to compile with SANITIZE_LIBBACKTRACE defined to signal
>> that backtrace-supported.h and backtrace.h is available and the
>> sanitizer_symbolizer_posix_libcdep.cc source then decides based on
>> backtrace-supported.h etc. whether it is usable.
>>
>> make check RUNTESTFLAGS='asan.exp ubsan.exp' passes with this.
>>
>> Some pending issues on the libbacktrace side:
>> 1) right now libbacktrace performs dl_iterate_phdr only the first time
>>backtrace_pcinfo or backtrace_syminfo is called, if there are some
>>dlopens/dlcloses in between that and another querying of the
>> symbolizer,
>>it won't notice that.  Perhaps it can be done only when we don't
>>find a symbol and/or have some function that tries to dl_iterare_phdr
>>again, look at cached st_ino/st_mtime or similar, and for threaded
>>version likely just keep old records, just add a flag to them that they
>>should be ignored (or say atomically decrease symbol count to zero
>>and something similar for debug info).
>> 2) for tsan querying of data symbols, apparently the classes want to see
>>not just the symbol name and start value, but also size.  libbacktrace
>>has all this info available, just doesn't pass it down to the callback.
>>I wonder if we'd need to create yet another libbacktrace entrypoint, or
>>if it would be acceptable to do source code incompatible, ABI (at least
>>on all sane targets) compatible version of just adding another
>>uintptr_t symsize argument to backtrace_syminfo_callback.
>> 3) I wonder if libbacktrace couldn't be updated to use __atomic_*
>> builtins,
>>then it could avoid the ugliness to emulate atomic loads and stores.
>>
>> As for sanitizer, the reason I haven't implemented SendCommand method for
>> the libbacktrace symbolizer is that the library doesn't provide the info
>> as text, but as individual values passed to the callback, so printing
>> that to text and then parsing the text would be very ugly.  libbacktrace
>> doesn't even need the module names and module offsets,
>
>
If libbacktrace does its own call of dl_iterate_phdr, then it doesn't need
many
pieces of existing Symbolizer (or POSIXSymbolizer, in new version of source
code)
complexity - like FindModuleForAddress() function. Consider creating a
separate
wrapper class for libbacktrace functionality and returning it from
Symbolizer::PlatformInit
factory (in the new version of code) instead of POSIXSymbolizer.
>
>>
>> so supposedly
>> we would need that only if libbacktrace failed to get accurate inline/call
>> or symbol info.  While the classes have both symbol name/file/line etc.
>> and module name/offset fields, apparently the latter are unused if the
>> former is filled in.
>>
>> 2013-11-18  Jakub Jelinek  
>>
>> PR sanitizer/59136
>> * configure.ac: Don't add target-libbacktrace to noconfigdirs
>> just because go hasn't been enabled if target-libsanitizer isn't
>> in
>> noconfigdirs.
>> * Makefile.def: Add configure-target-libsanitizer dependency on
>> configure-target-libbacktrace and all-target-libsanitizer
>> dependency
>> on configure-target-libsanitizer.
>> * configure: Regenerated.
>> libsanitizer/
>> * sanitizer_common/Makefile.am (AM_CPPFLAGS): If
>> SANITIZER_LIBBACKTRACE, append -I for libbacktrace headers and
>> -DSANITIZER_LIBBACKTRACE.
>> * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc: Add
>> libbacktrace symbolizer.
>> * tsan/Makefile.am (libtsan_la_LIBADD): Add libbacktrace.la if
>> SANITIZER_LIBBACKTRACE.
>> * asan/Makefile.am (libasan_la_LIBADD): Likewise.
>> * ubsan/Makefile.am (libubsan_la_LIBADD): Likewise.
>> * configure.ac (SANITIZER_LIBBACKTRACE): New AM_CONDITIONAL.
>> * sanitizer_common/Makefile.in: Regenerated.
>> * tsan/Makefile.in: Regenrated.
>> * a

Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p

2013-11-18 Thread Kyrill Tkachov

On 18/11/13 09:14, Kyrill Tkachov wrote:

On 18/11/13 08:37, Yvan Roux wrote:

So, the validation is ok with this patch, I'm just not able to say if
the original performance issue is still fixed with it.  Could you
check it Kyrylo ?

Hi Yvan,
I'll run the benchmark today to confirm the performance,


yep, all good performance-wise :)

Thanks,
Kyrill



but from compiling some
code sequences that exhibited the bad behaviour in the past, I see that this
patch still fixes the issues. store_minmaxsi is not generated when optimising
for speed.


Thanks,
Kyrill







Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-18 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 07:49:03PM +0400, Alexey Samsonov wrote:
> > As for sanitizer, the reason I haven't implemented SendCommand method for
> > the libbacktrace symbolizer is that the library doesn't provide the info
> > as text, but as individual values passed to the callback, so printing
> > that to text and then parsing the text would be very ugly.  libbacktrace
> > doesn't even need the module names and module offsets,
> 
> 
> If libbacktrace does its own call of dl_iterate_phdr, then it doesn't need
> many

It does.

> pieces of existing Symbolizer (or POSIXSymbolizer, in new version of source
> code)
> complexity - like FindModuleForAddress() function. Consider creating a
> separate

Well, as fallback it still needs that.  libbacktrace APIs won't tell you:
"ok, see, I don't know exactly what this symbol is, but it is in DSO
libfoobar.so plus offset 0x123456".  It tells you either
function name and file/line (for backtrace_pcinfo callback, possibly several
times to iterate through inline frames) or symbol name and symbol value
(hopefully in the future also symbol size) for backtrace_syminfo callback.
If the address isn't found in debug info resp. symbol tables, you just get
NULL function name of symbol value, NULL filename and 0 line number or
symbol value.  Initially I had the code as a different toplevel symbolizer,
but for the case where libbacktrace isn't there, or where it isn't usable
(due to host limitations e.g. not thread safe, or using malloc, etc.)
there needs to be a fallback to the POSIXSymbolizer.

Jakub


Re: [PATCH] S/390: More htm testcases plus some fixes

2013-11-18 Thread Peter Bergner
On Mon, 2013-11-18 at 10:05 +0100, Andreas Krebbel wrote:
> With the patch the htm-nofloat-2 testcase fails.  Due to the
> "returns_twice" flag on tbegin the optimizers fail to fold the
> compares of the condition code and the s390_optimize_nonescaping_tx
> routine in turn fails to optimize the simple transactions.  This will
> hopefully be fixed with a follow-on patch.

Hi Andreas,

I assume you're using the returns_twice attribute on your tbegin builtin
so that the compiler will help you with the handling of the floating
point registers since they are not restored on the s390's transaction
failure?  We don't have that attribute set on POWER's tbegin builtin
and I don't think we should since all of our registers are restored
on a transaction failure, but I'd like to know if you added that
attribute for any other reason such that POWER should have it too?

Peter




Re: [PATCH] Implement -fsanitize=null + new sanopt pass

2013-11-18 Thread Marek Polacek
On Mon, Nov 18, 2013 at 02:52:34PM +0100, Jakub Jelinek wrote:
> Oh, one more thing, please update gcc/doc/, the -fsanitize= description is
> far from up to date there.

Ok, the following (incremental) hopefully improves the docs.  Joseph, would
you mind having a look at this?  Thanks,

2013-11-18  Marek Polacek  

* doc/invoke.texi: Extend -fsanitize=undefined documentation.

--- gcc/doc/invoke.texi.mp3 2013-11-18 15:57:47.104103101 +0100
+++ gcc/doc/invoke.texi 2013-11-18 17:08:51.305594441 +0100
@@ -5260,9 +5260,45 @@ data race bugs.
 See @uref{http://code.google.com/p/data-race-test/wiki/ThreadSanitizer} for 
more details.
 
 @item -fsanitize=undefined
-Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector
+Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector.
 Various computations will be instrumented to detect undefined behavior
-at runtime, e.g.@: division by zero or various overflows.
+at runtime.  Current suboptions are:
+
+@itemize @bullet
+
+@item @option{-fsanitize=shift}
+
+This option enables checking that the result of a shift operation is
+not undefined.  Note that what exactly is considered undefined differs
+slightly between C and C++, as well as between ANSI C and C99, etc.
+
+@item @option{-fsanitize=integer-divide-by-zero}
+
+Detect integer division by zero as well as @code{INT_MIN / -1} division.
+Note that the latter is only made undefined from C99 onwards.
+
+@item @option{-fsanitize=unreachable}
+
+With this option, the compiler will turn the @code{__builtin_unreachable}
+call into a diagnostics message call instead.  When reaching the
+@code{__builtin_unreachable} call, the behavior is undefined.
+
+@item @option{-fsanitize=vla-bound}
+
+This option instructs the compiler to check that the size of a variable
+length array is positive.  This option does not have any effect in
+@option{-std=c++1y} mode, as the standard requires the exception be thrown
+instead.
+
+@item @option{-fsanitize=null}
+
+This option enables pointer checking.  Particularly, the application
+built with this option turned on will issue an error message when it
+tries to dereference a NULL pointer, or if a reference (possibly an
+rvalue reference) is bound to a NULL pointer.
+
+@end itemize
+
 While @option{-ftrapv} causes traps for signed overflows to be emitted,
 @option{-fsanitize=undefined} gives a diagnostic message.
 This currently works only for the C family of languages.

Marek


[WWWDOCS] Document IPA/LTO/FDO/i386 changes in GCC-4.9

2013-11-18 Thread Jan Hubicka
Hi,
there was many changes in this area.  The following are ones I can think of.
Please fell free to suggest more changes.
We probably should mention Teresa's splitting work once it is complete
and new micro-architectures targetd by x86 backend.

Honza

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.36
diff -u -r1.36 changes.html
--- changes.html15 Nov 2013 15:40:00 -  1.36
+++ changes.html18 Nov 2013 16:15:32 -
@@ -37,14 +37,52 @@
   
 AddressSanitizer, a fast memory error detector, is now available on 
ARM.
 
-  
-  
 UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior detector,
 has been added and can be enabled via 
-fsanitize=undefined.
Various computations will be instrumented to detect undefined behavior
at runtime.  UndefinedBehaviorSanitizer is currently available for the C
and C++ languages.
 
+Link-time optimization (LTO) improvements:
+
+  Type merging was rewritten. New implementation is significantly 
faster
+ and use less memory. 
+  Better partitioning algorithm resulting in less streaming during
+ link-time.
+  Early removal of virtual methods reduce size of object files and
+ improve link-time memory usage and compile time.
+  Functions are no longer pointlessly renamed.
+  Function bodies are now loaded on-demand and released early improving
+ overall memory usage at link-time.
+  C++ hidden keyed methods can now be optimized out.
+
+Memory usage of Firefox build with debug enabled was reduced from 15GB to
+3.5GB. Link time from 1700 seconds to 350 seconds.
+
+Inter-procedural optimization improvements:
+
+  New type inheritance analysis module improving devirtualization.
+ Devirtualization now take into account anonymous name-spaces and the
+ C++11 final keyword.
+  New speculative devirtualization pass (controlled by
+ -fdevirtualize-speculatively.
+  Calls that was speculatively made direct are turned back to indirect
+ when doing so does not bring any noticeable benefits.
+  Local aliases are introduced for symbols that are known to be
+ semantically equivalent across shared libraries improving dynamic
+ linking times.
+
+Feedback directed optimization improvements:
+
+  Profiling of programs using C++ inline functions is now more 
reliable.
+  New time profiling determine typical order in which functions are 
executed.
+  New function reordering pass (controlled by
+ -freorder-functions) significantly reduces
+ startup time of large applications.  Until binutils support is
+ completed, it is effective only with link time optimization.
+  Feedback driven indirect call removal and devirtualization now handle
+ cross-module calls when link-time optimization is enabled.
+
   
 
 New Languages and Language specific improvements
@@ -325,9 +363,20 @@
   href="http://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning.html";
   >Function Multiversioning.
 
- GCC now supports the new Intel microarchitecture named Silvermont
+GCC now supports the new Intel microarchitecture named Silvermont
   through -march=slm.
 
+-march=generic has been retuned for better support of
+  Intel core and AMD Bulldozer architectures.  Performance of AMD K7, K8,
+  Intel Pentium-M, and Pentium4 based CPUs is no longer considered 
important
+  for generic.
+
+Better inlining of memcpy and memset 
+   that is avare of value ranges and produce shorter alignment prologues.
+
+-mno-accumulate-outgoing-args is now honored when unwind
+  information is output.  Argument accumulation is also now turned off
+  for portions of program optimized for size.
   
 NDS32
   


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-18 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 07:50:33PM +0400, Alexey Samsonov wrote:
> On Mon, Nov 18, 2013 at 7:49 PM, Alexey Samsonov  wrote:
>  Unfortunately, recently there were a few enhancements to
>  sanitizer_symbolizer_posix_libcdep.cc (and friends),
>  in LLVM trunk, and now it looks different from gcc version (apparently, the
>  changes were committed
>  after the merge to gcc happened, I should have pointed this out earlier,
>  sorry).
> 
>  Kostya (or Jakub), is it possible to somehow pick up the changes? Otherwise
>  this patch can't go in ASan runtime
>  in gcc - the code will significantly diverge.

Here is an (untested) forward port of the patch to what is in llvm svn right
now.  The unpatched code always had either internal_symbolizer_ != NULL, or
external_symbolizer_ != NULL, or both NULL, but not both set.  The patch
just adds a third variant, libbacktrace_symbolizer_, again, at most one
of the 3 will be non-NULL, and the priorities are that internal_symbolizer_
has highest priority, then libbacktrace (if available/usable), then the
external one.

--- sanitizer_symbolizer_posix_libcdep.cc.jj2013-11-12 19:35:30.0 
+0100
+++ sanitizer_symbolizer_posix_libcdep.cc   2013-11-18 17:16:03.202643957 
+0100
@@ -27,6 +27,16 @@
 #include 
 #include 
 
+#ifdef SANITIZER_LIBBACKTRACE
+#include "backtrace-supported.h"
+#if SANITIZER_LINUX && BACKTRACE_SUPPORTED \
+&& !BACKTRACE_USES_MALLOC && BACKTRACE_SUPPORTS_THREADS
+#include "backtrace.h"
+#else
+#undef SANITIZER_LIBBACKTRACE
+#endif
+#endif
+
 // C++ demangling function, as required by Itanium C++ ABI. This is weak,
 // because we do not require a C++ ABI library to be linked to a program
 // using sanitizers; if it's not present, we'll just use the mangled name.
@@ -364,12 +374,124 @@ class InternalSymbolizer {
 
 #endif  // SANITIZER_SUPPORTS_WEAK_HOOKS
 
+#if SANITIZER_LIBBACKTRACE
+namespace {
+
+struct SymbolizeCodeData {
+  AddressInfo *frames;
+  uptr n_frames;
+  uptr max_frames;
+  const char *module_name;
+  uptr module_offset;
+};
+
+extern "C" {
+
+static int SymbolizeCodePCInfoCallback(void *vdata, uintptr_t addr,
+  const char *filename, int lineno,
+  const char *function) {
+  SymbolizeCodeData *cdata = (SymbolizeCodeData *)vdata;
+  if (function) {
+AddressInfo *info = &cdata->frames[cdata->n_frames++];
+info->Clear();
+info->FillAddressAndModuleInfo(addr, cdata->module_name, 
cdata->module_offset);
+info->function = internal_strdup(function);
+if (filename)
+  info->file = internal_strdup(filename);
+info->line = lineno;
+  }
+  return 0;
+}
+
+static void SymbolizeCodeCallback(void *vdata, uintptr_t addr,
+ const char *symname, uintptr_t) {
+  SymbolizeCodeData *cdata = (SymbolizeCodeData *)vdata;
+  if (symname) {
+AddressInfo *info = &cdata->frames[0];
+info->Clear();
+info->FillAddressAndModuleInfo(addr, cdata->module_name, 
cdata->module_offset);
+info->function = internal_strdup(symname);
+cdata->n_frames = 1;
+  }
+}
+
+static void SymbolizeDataCallback(void *vdata, uintptr_t,
+ const char *symname, uintptr_t symval) {
+  DataInfo *info = (DataInfo *)vdata;
+  if (symname && symval) {
+info->name = internal_strdup(symname);
+info->start = symval;
+  }
+}
+
+static void ErrorCallback(void *, const char *, int) {
+}
+
+}
+
+}
+
+class LibbacktraceSymbolizer {
+ public:
+  static LibbacktraceSymbolizer *get(LowLevelAllocator *alloc) {
+backtrace_state *state
+  = backtrace_create_state("/proc/self/exe", 1, ErrorCallback, NULL);
+if (!state)
+  return 0;
+return new(*alloc) LibbacktraceSymbolizer(state);
+  }
+
+  uptr SymbolizeCode(uptr addr, AddressInfo *frames, uptr max_frames,
+const char *module_name, uptr module_offset) {
+SymbolizeCodeData data;
+data.frames = frames;
+data.n_frames = 0;
+data.max_frames = max_frames;
+data.module_name = module_name;
+data.module_offset = module_offset;
+backtrace_pcinfo(state_, addr, SymbolizeCodePCInfoCallback, ErrorCallback,
+&data);
+if (data.n_frames)
+  return data.n_frames;
+backtrace_syminfo(state_, addr, SymbolizeCodeCallback, ErrorCallback, 
&data);
+return data.n_frames;
+  }
+
+  void SymbolizeData(DataInfo *info) {
+backtrace_syminfo(state_, info->address, SymbolizeDataCallback,
+ ErrorCallback, info);
+  }
+
+ private:
+  LibbacktraceSymbolizer(backtrace_state *state) : state_(state) { }
+
+  backtrace_state *state_; // Leaked.
+};
+#else
+class LibbacktraceSymbolizer {
+ public:
+  static LibbacktraceSymbolizer *get(LowLevelAllocator *) {
+return 0;
+  }
+
+  uptr SymbolizeCode(uptr addr, AddressInfo *frames, uptr max_frames,
+const char *module_name, uptr module_offset) {
+return 0;
+  }
+
+  void SymbolizeData(DataInfo *in

Re: [PATCH] Fix libbacktrace on prelinked shared libraries

2013-11-18 Thread Ian Lance Taylor
On Mon, Nov 18, 2013 at 5:11 AM, Jakub Jelinek  wrote:
>
> info->dlpi_addr == 0 is true for executables (but those have also
> info->dlpi_name set to ""), but not just for those, it is true also
> for successfully prelinked shared libraries or any other libraries that
> have been linked at non-zero offset and the dynamic linker managed to mmap
> them where they were prelinked resp. linked to.
>
> BTW, I'm surprised you aren't adding base_address (well, in fact it is
> actually not base address, but load bias, difference between load address
> and link base address) to elf_initialize_syminfo created symbol addresses,
> because all of them should be biased by the load address.
> And, elf_initialize_syminfo should ignore st_shndx == SHN_UNDEF symbols,
> those shouldn't be biased but not even entered into the table.

You're right, these are bugs.


> 2013-11-18  Jakub Jelinek  
>
> * elf.c (phdr_callback): Process info->dlpi_addr == 0 normally.

This is OK.

Thanks.

Ian


[PATCH] Add reference binding instrumentation

2013-11-18 Thread Marek Polacek
This incremental patch brings the implementation of reference binding
instrumentation, that is, it detects cases like

int *p = NULL;
int &r = *p;
or 
auto &&rr = *p;
and similarly.  It does so by adding a COMPOUND_EXPR to the decl.
As standard says, "A reference shall be initialized to refer to a
valid object or function.".

Jason, is that tiny C++ part ok with you?

Regtested/bootstrapped on x86_64-linux, ran bootstrap-ubsan, ok for
trunk?

2013-11-18  Marek Polacek  

c-family/
* c-ubsan.h (ubsan_instrument_reference): Declare.
* c-ubsan.c (ubsan_instrument_reference): New function.
cp/
* decl.c (cp_finish_decl): Instrument reference binding.
testsuite/
* g++.dg/ubsan/null-1.C: New test.

--- gcc/c-family/c-ubsan.h.mp2  2013-11-18 12:52:00.572671736 +0100
+++ gcc/c-family/c-ubsan.h  2013-11-18 12:52:25.751761970 +0100
@@ -24,5 +24,6 @@ along with GCC; see the file COPYING3.
 extern tree ubsan_instrument_division (location_t, tree, tree);
 extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
 extern tree ubsan_instrument_vla (location_t, tree);
+extern tree ubsan_instrument_reference (location_t, tree);
 
 #endif  /* GCC_C_UBSAN_H  */
--- gcc/c-family/c-ubsan.c.mp2  2013-11-18 12:49:00.553025438 +0100
+++ gcc/c-family/c-ubsan.c  2013-11-18 15:26:53.835120271 +0100
@@ -179,3 +179,30 @@ ubsan_instrument_vla (location_t loc, tr
 
   return t;
 }
+
+/* Instrument reference binding, that is, ensure that the reference
+   declaration doesn't bind the reference to a NULL pointer.  */
+
+tree
+ubsan_instrument_reference (location_t loc, tree init)
+{
+  if (!INDIRECT_REF_P (init))
+/* This may happen, e.g. int &&r4 = p;, so don't put an assert here.  */
+return init;
+
+  init = TREE_OPERAND (init, 0);
+  tree eq_expr = fold_build2 (EQ_EXPR, boolean_type_node, init,
+ build_zero_cst (TREE_TYPE (init)));
+  const struct ubsan_mismatch_data m
+= { build_zero_cst (pointer_sized_int_node),
+   build_int_cst (unsigned_char_type_node, UBSAN_REF_BINDING)};
+  tree data = ubsan_create_data ("__ubsan_null_data",
+loc, &m,
+ubsan_type_descriptor (TREE_TYPE (init),
+   true), NULL_TREE);
+  data = build_fold_addr_expr_loc (loc, data);
+  tree fn = builtin_decl_implicit (BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH);
+  fn = build_call_expr_loc (loc, fn, 2, data,
+   build_zero_cst (pointer_sized_int_node));
+  return fold_build3 (COND_EXPR, void_type_node, eq_expr, fn, void_zero_node);
+}
--- gcc/cp/decl.c.mp2   2013-11-15 17:14:24.887512640 +0100
+++ gcc/cp/decl.c   2013-11-18 13:38:15.356077696 +0100
@@ -6216,6 +6217,11 @@ cp_finish_decl (tree decl, tree init, bo
  if (decl_maybe_constant_var_p (decl))
TREE_CONSTANT (decl) = 1;
}
+  if (flag_sanitize & SANITIZE_NULL
+ && TREE_CODE (type) == REFERENCE_TYPE)
+   init = fold_build2 (COMPOUND_EXPR, TREE_TYPE (init),
+   ubsan_instrument_reference (input_location, init),
+   init);
 }
 
   if (processing_template_decl)
--- gcc/testsuite/g++.dg/ubsan/null-1.C.mp2 2013-11-18 15:29:44.722831910 
+0100
+++ gcc/testsuite/g++.dg/ubsan/null-1.C 2013-11-18 15:29:54.744874505 +0100
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=null -w -std=c++11" } */
+
+typedef const long int L;
+
+int
+main (void)
+{
+  int *p = 0;
+  L *l = 0;
+
+  int &r = *p;
+  auto &r2 = *p;
+  L &lr = *l;
+
+  /* Try an rvalue reference.  */
+  auto &&r3 = *p;
+}
+
+/* { dg-output "reference binding to null pointer of type 'int'(\n|\r\n|\r)" } 
*/
+/* { dg-output "\[^\n\r]*reference binding to null pointer of type 
'int'(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*reference binding to null pointer of type 'const 
L'(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*reference binding to null pointer of type 
'int'(\n|\r\n|\r)" } */

Marek


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-18 Thread Jeff Law

On 11/08/13 02:02, Ilya Enkovich wrote:

Hi,

Here is an updated patch version with no langhook.

Regarding TLS objects issue - I do not think compiler should compensate the 
absence of instrumentation in libraries.  Compiler should be responsible for 
initialization of Bounds Tables for .tdata section.  Correct data copy is a 
responsibility of library.  User should use either instrumented library or 
wrapper calls if he needs this functionality.

Thanks,
Ilya
--
gcc/

2013-11-06  Ilya Enkovich  

* c/c-parser.c: Include tree-chkp.h.
(c_parser_declaration_or_fndef): Register statically
initialized decls in Pointer Bounds Checker.
* cp/decl.c: Include tree-chkp.h.
(cp_finish_decl): Register statically
initialized decls in Pointer Bounds Checker.
* gimplify.c: Include tree-chkp.h.
(gimplify_init_constructor): Register statically
initialized decls in Pointer Bounds Checker.
Is parsing really the right time to register these things with the 
checking framework?  Doesn't all this stuff flow through the gimplifier? 
 If so wouldn't that be a better place?


If it can be done in the gimplifier, which seems good from the 
standpoint of simplifying the long term maintenance of the checking code.


If there's a good reason to have this front-end, please explain it.

Thanks,
Jeff



Re: [PATCH] Fix libbacktrace on prelinked shared libraries

2013-11-18 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 08:29:05AM -0800, Ian Lance Taylor wrote:
> On Mon, Nov 18, 2013 at 5:11 AM, Jakub Jelinek  wrote:
> >
> > info->dlpi_addr == 0 is true for executables (but those have also
> > info->dlpi_name set to ""), but not just for those, it is true also
> > for successfully prelinked shared libraries or any other libraries that
> > have been linked at non-zero offset and the dynamic linker managed to mmap
> > them where they were prelinked resp. linked to.
> >
> > BTW, I'm surprised you aren't adding base_address (well, in fact it is
> > actually not base address, but load bias, difference between load address
> > and link base address) to elf_initialize_syminfo created symbol addresses,
> > because all of them should be biased by the load address.
> > And, elf_initialize_syminfo should ignore st_shndx == SHN_UNDEF symbols,
> > those shouldn't be biased but not even entered into the table.
> 
> You're right, these are bugs.

So what about this fix then?

Tested on a short testcase that links against a shared library without
debug info and looks up the symbol using backtrace_syminfo.
make check in libbacktrace also passes.

2013-11-18  Jakub Jelinek  

* elf.c (SHN_UNDEF): Define.
(elf_initialize_syminfo): Add base_address argument.  Ignore symbols
with st_shndx == SHN_UNDEF.  Add base_address to address fields.
(elf_add): Adjust caller.

--- libbacktrace/elf.c.jj   2013-11-18 12:48:07.0 +0100
+++ libbacktrace/elf.c  2013-11-18 17:41:38.903835173 +0100
@@ -98,6 +98,7 @@ dl_iterate_phdr (int (*callback) (struct
 #undef EV_CURRENT
 #undef SHN_LORESERVE
 #undef SHN_XINDEX
+#undef SHN_UNDEF
 #undef SHT_SYMTAB
 #undef SHT_STRTAB
 #undef SHT_DYNSYM
@@ -183,6 +184,7 @@ typedef struct {
   b_elf_wxword sh_entsize; /* Entry size if section holds table */
 } b_elf_shdr;  /* Elf_Shdr.  */
 
+#define SHN_UNDEF  0x  /* Undefined section */
 #define SHN_LORESERVE  0xFF00  /* Begin range of reserved indices */
 #define SHN_XINDEX 0x  /* Section index is held elsewhere */
 
@@ -342,6 +344,7 @@ elf_symbol_search (const void *vkey, con
 
 static int
 elf_initialize_syminfo (struct backtrace_state *state,
+   uintptr_t base_address,
const unsigned char *symtab_data, size_t symtab_size,
const unsigned char *strtab, size_t strtab_size,
backtrace_error_callback error_callback,
@@ -365,7 +368,8 @@ elf_initialize_syminfo (struct backtrace
   int info;
 
   info = sym->st_info & 0xf;
-  if (info == STT_FUNC || info == STT_OBJECT)
+  if ((info == STT_FUNC || info == STT_OBJECT)
+ && sym->st_shndx != SHN_UNDEF)
++elf_symbol_count;
 }
 
@@ -385,6 +389,8 @@ elf_initialize_syminfo (struct backtrace
   info = sym->st_info & 0xf;
   if (info != STT_FUNC && info != STT_OBJECT)
continue;
+  if (sym->st_shndx == SHN_UNDEF)
+   continue;
   if (sym->st_name >= strtab_size)
{
  error_callback (data, "symbol string index out of range", 0);
@@ -393,7 +399,7 @@ elf_initialize_syminfo (struct backtrace
  return 0;
}
   elf_symbols[j].name = (const char *) strtab + sym->st_name;
-  elf_symbols[j].address = sym->st_value;
+  elf_symbols[j].address = sym->st_value + base_address;
   elf_symbols[j].size = sym->st_size;
   ++j;
 }
@@ -733,7 +739,7 @@ elf_add (struct backtrace_state *state,
   if (sdata == NULL)
goto fail;
 
-  if (!elf_initialize_syminfo (state,
+  if (!elf_initialize_syminfo (state, base_address,
   symtab_view.data, symtab_shdr->sh_size,
   strtab_view.data, strtab_shdr->sh_size,
   error_callback, data, sdata))


Jakub


Re: [PATCH] Implement -fsanitize=null + new sanopt pass

2013-11-18 Thread Joseph S. Myers
On Mon, 18 Nov 2013, Marek Polacek wrote:

> +@item @option{-fsanitize=shift}
> +
> +This option enables checking that the result of a shift operation is
> +not undefined.  Note that what exactly is considered undefined differs
> +slightly between C and C++, as well as between ANSI C and C99, etc.

We generally refer to ISO C90, not ANSI C.

> +Detect integer division by zero as well as @code{INT_MIN / -1} division.
> +Note that the latter is only made undefined from C99 onwards.

INT_MIN / -1 is unambiguously undefined in C90 - it's a signed arithmetic 
overflow (result not within the range of its type).  It's INT_MIN % -1 
where there's more ambiguity, but I consider the wording changes in C11 as 
a defect correction that should be applied back to C90.  (A comment on 
what the semantics should be, not on whether the documentation accurately 
reflects the code.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-18 Thread Ian Lance Taylor
On Mon, Nov 18, 2013 at 5:39 AM, Jakub Jelinek  wrote:
>
> I've kept the possibility to override the symbolizer by magic symbols
> (InternalSymbolizer), and as I have no idea how the llvm buildsystem etc.
> works and what is the possibility there to add libbacktrace, this just
> requires users to compile with SANITIZE_LIBBACKTRACE defined to signal
> that backtrace-supported.h and backtrace.h is available and the
> sanitizer_symbolizer_posix_libcdep.cc source then decides based on
> backtrace-supported.h etc. whether it is usable.

Just FYI for those who don't know, libbacktrace is under a BSD license
so there should be no licensing objection to using it in LLVM.


> 1) right now libbacktrace performs dl_iterate_phdr only the first time
>backtrace_pcinfo or backtrace_syminfo is called, if there are some
>dlopens/dlcloses in between that and another querying of the symbolizer,
>it won't notice that.  Perhaps it can be done only when we don't
>find a symbol and/or have some function that tries to dl_iterare_phdr
>again, look at cached st_ino/st_mtime or similar, and for threaded
>version likely just keep old records, just add a flag to them that they
>should be ignored (or say atomically decrease symbol count to zero
>and something similar for debug info).

Yes, this is a FIXME in dwarf_fileline in libbacktrace/dwarf.c.


> 2) for tsan querying of data symbols, apparently the classes want to see
>not just the symbol name and start value, but also size.  libbacktrace
>has all this info available, just doesn't pass it down to the callback.
>I wonder if we'd need to create yet another libbacktrace entrypoint, or
>if it would be acceptable to do source code incompatible, ABI (at least
>on all sane targets) compatible version of just adding another
>uintptr_t symsize argument to backtrace_syminfo_callback.

I think it would be fine to change the callback.  I doubt that
libbacktrace is so widely used that we need to worry about backward
compatibility at this stage.  In particular I imagine that any users
of libbacktrace are simply copying the source code, since there is no
installable package.


> 3) I wonder if libbacktrace couldn't be updated to use __atomic_* builtins,
>then it could avoid the ugliness to emulate atomic loads and stores.

I think that would be fine.  I can't remember why I didn't use the
__atomic builtins.  Maybe they weren't quite ready at the time.

Ian


Re: [PATCH] Fix libbacktrace on prelinked shared libraries

2013-11-18 Thread Ian Lance Taylor
On Mon, Nov 18, 2013 at 8:46 AM, Jakub Jelinek  wrote:
>
> 2013-11-18  Jakub Jelinek  
>
> * elf.c (SHN_UNDEF): Define.
> (elf_initialize_syminfo): Add base_address argument.  Ignore symbols
> with st_shndx == SHN_UNDEF.  Add base_address to address fields.
> (elf_add): Adjust caller.

This is OK.

Thanks.

Ian


Re: Add value range support into memcpy/memset expansion

2013-11-18 Thread H.J. Lu
On Mon, Nov 18, 2013 at 6:19 AM, H.J. Lu  wrote:
> On Sun, Nov 17, 2013 at 3:38 PM, Jan Hubicka  wrote:
>> Hi,
>> this is version I comitted. It also adds a testcase and enables the support 
>> in i386 backend.
>>
>> Honza
>>
>> * doc/md.texi (setmem, movstr): Update documentation.
>> * builtins.c (determine_block_size): New function.
>> (expand_builtin_memcpy): Use it and pass it to
>> emit_block_move_hints.
>> (expand_builtin_memset_args): Use it and pass it to
>> set_storage_via_setmem.
>> * expr.c (emit_block_move_via_movmem): Add min_size/max_size 
>> parameters;
>> update call to expander.
>> (emit_block_move_hints): Add min_size/max_size parameters.
>> (clear_storage_hints): Likewise.
>> (set_storage_via_setmem): Likewise.
>> (clear_storage): Update.
>> * expr.h (emit_block_move_hints, clear_storage_hints,
>> set_storage_via_setmem): Update prototype.
>> * i386.c (ix86_expand_set_or_movmem): Add bounds; export.
>> (ix86_expand_movmem, ix86_expand_setmem): Remove.
>> (ix86_expand_movmem, ix86_expand_setmem): Remove.
>> * i386.md (movmem, setmem): Pass parameters.
>>
>> * testsuite/gcc.target/i386/memcpy-2.c: New testcase.
>
> The new testcase fails for me on x86 and x86-64:
>
> FAIL: gcc.target/i386/memcpy-2.c scan-assembler-not memcpy
> FAIL: gcc.target/i386/memcpy-2.c (test for excess errors)
>

I got

[hjl@gnu-6 gcc]$ /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc/build-x86_64-linux/gcc/
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/memcpy-2.c
-fno-diagnostics-show-caret -fdiagnostics-color=never  -O2
-ffat-lto-objects -ffat-lto-objects -S  -m32 -o memcpy-2.s
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/memcpy-2.c:
In function ‘t’:
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/memcpy-2.c:10:5:
warning: incompatible implicit declaration of built-in function
‘memcpy’ [enabled by default]
[hjl@gnu-6 gcc]$ cat memcpy-2.s
.file"memcpy-2.c"
.text
.p2align 4,,15
.globlt
.typet, @function
t:
.LFB0:
.cfi_startproc
subl$28, %esp
.cfi_def_cfa_offset 32
movl32(%esp), %eax
cmpl$9, %eax
ja.L3
movl%eax, 8(%esp)
movlb, %eax
movl%eax, 4(%esp)
movla, %eax
movl%eax, (%esp)
callmemcpy
.L3:
addl$28, %esp
.cfi_def_cfa_offset 4
ret
.cfi_endproc


/* { dg-final { scan-assembler-not "memcpy" } } */

will also match

.file"memcpy-2.c"

-- 
H.J.


  1   2   3   >