Re: [PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-26 Thread Maxim Ostapenko

Thank you all for your help!

Done in r215633.

-Maxim
On 09/25/2014 11:05 PM, Jeff Law wrote:

On 09/23/14 01:14, Maxim Ostapenko wrote:



2014-09-04  Jakub Jelinek
Max Ostapenko

* common.opt: New option.
* doc/invoke.texi: Describe new option.
* gcc.c (execute): Don't free first string early, but at the end
of the function.  Call retry_ice if compiler exited with
ICE_EXIT_CODE.
(main): Factor out common code.
(print_configuration): New function.
(files_equal_p): Likewise.
(check_repro): Likewise.
(run_attempt): Likewise.
(do_report_bug): Likewise.
(append_text): Likewise.
(try_generate_repro): Likewise.

Approved.  Please install.

Thanks for your patience,
Jeff






Re: [debug-early] fix fortran regressions

2014-09-26 Thread Richard Biener
On Thu, Sep 25, 2014 at 8:11 PM, Aldy Hernandez  wrote:
> push_cfun() fails when there's no cfun stack.  With this patch, we use
> set_cfun if no stack is available.
>
> This fixes the 16 Fortran guality regressions.
>
> Now guality tests all pass, for all languages.
>
> Committed to branch.

Hmm, I'd rather avoid push_cfun completely.  It seems that mainline
doesn't have it?  Note that push_cfun also does target specific
switching which shouldn't be necessary.

Perhaps dwarf2out.c eventually wants its own "context"?

That is, the type/decl part of dwarf2out.c should work solely with
current_function_decl (which you can simply change/restore)
while the backend part (locations, etc.) should work using cfun.

So - please try dropping push_cfun as you set current_function_decl
anyway.

Richard.
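
The suggestion above, changing and restoring current_function_decl instead of pushing a whole cfun, can be sketched in plain C. This is a toy model and not GCC code: `struct decl`, `current_decl`, `with_current_decl` and `record_current_name` are hypothetical stand-ins for `tree`, `current_function_decl` and the dwarf2out.c caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's tree; only the name matters here.  */
struct decl { const char *name; };

/* Toy counterpart of the global current_function_decl.  */
static struct decl *current_decl = NULL;

/* Run CALLBACK with current_decl temporarily set to FN, then restore
   the previous value.  Nested calls work without any context stack.  */
void
with_current_decl (struct decl *fn, void (*callback) (void *), void *data)
{
  struct decl *saved = current_decl;
  current_decl = fn;
  callback (data);
  current_decl = saved;
}

/* Example callback: record the name that was current during the call.  */
void
record_current_name (void *data)
{
  const char **out = (const char **) data;
  assert (current_decl != NULL);
  *out = current_decl->name;
}
```

Because the old value is saved in a local and restored on exit, the caller sees the same current_decl before and after, which is the property the type/decl part of the debug output would rely on in this sketch.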


Re: Fix TYPE_MAIN_VARIANT made by ipa-prop

2014-09-26 Thread Richard Biener
On Fri, Sep 26, 2014 at 12:14 AM, Jan Hubicka  wrote:
> Hi,
> according to my type checker, ipa-prop is the only place where we produce a
> variant of a METHOD_TYPE that is a FUNCTION_TYPE, or a variant that has
> different parameters.
>
> The code in question produces a new prototype to remove unused arguments,
> but I do not think it should declare the new type to be a variant of the
> original.
>
> Historically I think I introduced the code because I simply copied what linker
> does without caring too much about the consequences.

I agree.

> Bootstrapped/regtested x86_64-linux, OK?

Yes.

Thanks,
Richard.

> * ipa-prop.c (ipa_modify_formal_parameters): Do not declare new type
> to be variant of original type.
>
> Index: ipa-prop.c
> ===================================================================
> --- ipa-prop.c  (revision 215615)
> +++ ipa-prop.c  (working copy)
> @@ -3990,21 +3990,6 @@
>DECL_FUNCTION_CODE (fndecl) = (enum built_in_function) 0;
>  }
>
> -  /* This is a new type, not a copy of an old type.  Need to reassociate
> - variants.  We can handle everything except the main variant lazily.  */
> -  tree t = TYPE_MAIN_VARIANT (orig_type);
> -  if (orig_type != t)
> -{
> -  TYPE_MAIN_VARIANT (new_type) = t;
> -  TYPE_NEXT_VARIANT (new_type) = TYPE_NEXT_VARIANT (t);
> -  TYPE_NEXT_VARIANT (t) = new_type;
> -}
> -  else
> -{
> -  TYPE_MAIN_VARIANT (new_type) = new_type;
> -  TYPE_NEXT_VARIANT (new_type) = NULL;
> -}
> -
>TREE_TYPE (fndecl) = new_type;
>DECL_VIRTUAL_P (fndecl) = 0;
>DECL_LANG_SPECIFIC (fndecl) = NULL;


Re: [shrink-wrap] should not sink instructions which may cause trap ?

2014-09-26 Thread Richard Biener
On Fri, Sep 26, 2014 at 12:49 AM, Jiong Wang
 wrote:
> 2014-09-25 14:07 GMT+01:00 Jiong Wang :
>>
>> On 25/09/14 12:25, Christophe Lyon wrote:
>>>

>>> I have observed regressions in the g++ testsuite: pr49847 now FAILs
>>> after this patch.
>>
>> no.
>>
>> even without my patch, the regression still happens.
>>
>> even if you specify -fno-shrink-wrap, gcc still crashes.
>>
>> so this regression should be caused by other commits which haven't been
>> exposed by the x86 regression tests.
>
> sorry, confirmed, there is a regression.
>
> my code was at git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215590.
> there was also a gcc crash on aarch64, with the following info:
>   pr49847.C:5:21: internal compiler error: Segmentation fault
>  try { return g >= 0; }
>  ^
>   0xdc249e crash_signal
>   ../../gcc/gcc/toplev.c:340
>   0xdbfeff default_get_reg_raw_mode(int)
>
> so I was thinking it was caused by other commits rather than this one, but
> after I synced the code to
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215599 I could
> reproduce this bug.
>
>   error: missing REG_EH_REGION note at the end of bb 2
>
> the reason is:
>   * before this patch, we only sank simple "set reg, reg" instructions, which
> cannot produce an exception, so no REG_EH_REGION note was attached.
>   * after this patch, we also sink instructions like the following on
> aarch64, arm or other RISC targets.
>
> (insn 7 3 30 2 (set (reg:CCFPE 66 cc)
> (compare:CCFPE (reg:SF 32 v0 [ g ])
> (const_double:SF 0.0 [0x0.0p+0]))) pr49847.C:5 330 {*cmpesf}
>  (expr_list:REG_DEAD (reg:SF 32 v0 [ g ])
> (expr_list:REG_EH_REGION (const_int 1 [0x1])
> (nil))))
>
>   "compare" is actually an operator which may trap, and we need to prevent
>   any instruction which may trap from being sunk, because that may
> break the exception handling logic.
>
>   so something like the following should be added to the iterator check:
>
>   if (may_trap_p (x))
> don't sink this instruction.
>
>any comments?

Should be checking if x may throw internally instead.

Richard.

>I will try to send a fix tomorrow.
>
>thanks.
>
> -- Jiong
>
>
>>
>> -- Jiong
>>
>>
>>>
>>> Here is what I have in my logs:
>>>
>>> /aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/gcc/testsuite/g++/../../xg++
>>>
>>> -B/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/gcc/testsuite/g++/../../
>>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C
>>> -fno-diagnostics-show-caret -fdiagnostics-color=never  -nostdinc++
>>>
>>> -I/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/arm-none-linux-gnueabihf/libstdc++-v3/include/arm-none-linux-gnueabihf
>>>
>>> -I/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/arm-none-linux-gnueabihf/libstdc++-v3/include
>>> -I/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
>>> -I/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/include/backward
>>> -I/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/util
>>> -fmessage-length=0  -std=gnu++98 -O -fnon-call-exceptions  -S -o
>>> pr49847.s(timeout = 800)
>>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C: In
>>> function 'int f(float)':
>>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C:7:1:
>>> error: missing REG_EH_REGION note at the end of bb 2
>>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C:7:1:
>>> internal compiler error: verify_flow_info failed
>>> 0x82f8ba verify_flow_info()
>>>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfghooks.c:260
>>>
>>> 0x840cd3 commit_edge_insertions()
>>>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgrtl.c:2068
>>> 0x9bf243 thread_prologue_and_epilogue_insns
>>>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:5852
>>> 0x9bfa52 rest_of_handle_thread_prologue_and_epilogue
>>>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:6245
>>> 0x9bfa52 execute
>>>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:6283
>>>
>>> As per
>>>
>>> http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215563/report-build-info.html
>>> I've noticed this on targets:
>>> arm-none-linux-gnueabihf
>>> armeb-none-linux-gnueabihf
>>> aarch64-none-elf
>>> aarch64_be-none-elf
>>> aarch64-none-linux-gnu
>>> but NOT on
>>> arm-none-eabi
>>> arm-none-linux-gnueabi
>>>
>>> Christophe.
>>>
>>
>>
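
The two candidate checks discussed in this thread, "may trap" versus "may throw internally", can be contrasted with a toy model in plain C. Nothing here is GCC's API: `struct toy_insn` and both predicates are hypothetical, and the `eh_region` field only loosely imitates a REG_EH_REGION note (in this sketch a positive value simply stands for "has a handler in the current function").

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of an instruction.  */
struct toy_insn
{
  bool may_trap;   /* e.g. a floating-point compare */
  int eh_region;   /* > 0: has a handler in this function; 0: none */
};

/* Conservative rule from the original proposal: never sink anything
   that may trap.  */
bool
can_sink_if_no_trap (const struct toy_insn *insn)
{
  return !insn->may_trap;
}

/* Narrower rule following the reply: only refuse insns that may throw
   to a handler inside the current function.  */
bool
can_sink_if_no_internal_throw (const struct toy_insn *insn)
{
  return insn->eh_region <= 0;
}
```

Under these assumptions a trapping compare outside any try block is rejected by the first rule but accepted by the second, while the compare from pr49847.C, which sits inside a try block, is rejected by both.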


[PATCH] Support for BIT_FIELD_REF in asan.c

2014-09-26 Thread Marat Zakirov

Hi all!

Here's a patch which instruments byte-aligned BIT_FIELD_REFs. During a GCC
asan-bootstrap and a Linux kernel build I didn't find any cases where a
BIT_FIELD_REF is not 8-bit aligned. But I am not confident enough to
replace the current early return for a misaligned BIT_FIELD_REF with an
assert.


Ok to commit?

--Marat
gcc/ChangeLog:

2014-09-19  Marat Zakirov  

	* asan.c (instrument_derefs): BIT_FIELD_REF added.

gcc/testsuite/ChangeLog:

2014-09-19  Marat Zakirov  

	* c-c++-common/asan/bitfield-5.c: New test.

diff --git a/gcc/asan.c b/gcc/asan.c
index cf5de27..451af33 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1705,6 +1705,7 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
 case INDIRECT_REF:
 case MEM_REF:
 case VAR_DECL:
+case BIT_FIELD_REF:
   break;
   /* FALLTHRU */
 default:
diff --git a/gcc/testsuite/c-c++-common/asan/bitfield-5.c b/gcc/testsuite/c-c++-common/asan/bitfield-5.c
new file mode 100644
index 0000000..eb5e9e9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/bitfield-5.c
@@ -0,0 +1,24 @@
+/* Check BIT_FIELD_REF.  */
+
+/* { dg-do run } */
+/* { dg-shouldfail "asan" } */
+
+struct A
+{
+  int y : 20;
+  int x : 13;
+};
+
+int __attribute__ ((noinline, noclone))
+f (void *p) {
+  return ((struct A *)p)->x != 0;
+}
+
+int
+main ()
+{
+  int a = 0;
+  return f (&a);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow" } */
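
As a rough illustration of what "byte-aligned" could mean for a bit-field reference, here is a small self-contained C sketch. It is not the check from asan.c: `bit_ref_is_byte_aligned` and `BITS_PER_BYTE` are hypothetical names, and the rule (offset and size both multiples of 8) is just one plausible reading of the description above. `struct A` repeats the layout from the new test.

```c
#include <assert.h>
#include <stdbool.h>

#define BITS_PER_BYTE 8

/* Return true if a bit-field reference starting at BIT_OFFSET and
   spanning BIT_SIZE bits covers whole bytes only.  */
bool
bit_ref_is_byte_aligned (unsigned long bit_offset, unsigned long bit_size)
{
  return bit_size != 0
         && bit_offset % BITS_PER_BYTE == 0
         && bit_size % BITS_PER_BYTE == 0;
}

/* Same layout as struct A in the test above.  */
struct A
{
  int y : 20;
  int x : 13;
};
```

On targets with a 32-bit int the two fields need 33 bits, so `struct A` is larger than a single int. That is presumably why reading `x` through a pointer to the lone `int a` in the test is expected to be reported as a stack-buffer-overflow.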



Re: [Patch 2/4] Hack out a use of MOVE_RATIO in tree-inline.c

2014-09-26 Thread Richard Biener
On Thu, Sep 25, 2014 at 4:57 PM, James Greenhalgh
 wrote:
>
> Hi,
>
> This patch hookizes the use of MOVE_RATIO in
> tree-inline.c:estimate_move_cost as TARGET_ESTIMATE_BLOCK_COPY_NINSNS.
> This hook should return an estimate for the number of instructions
> which will be emitted to copy a block of memory.
>
> tree-inline.c uses this in inlining heuristics to estimate the cost of
> moving an object. The implementation is lacking, and will likely
> underestimate the size of most copies.
>
> An initial iteration of this patch migrated tree-inline.c to use
> move_by_pieces_profitable_p and move_by_pieces_ninsns, but this
> proved painful for performance on ARM.
>
> This patch puts the control in the hands of the backend, and uses
> the existing logic as a default.
>
> Bootstrapped on x86_64, ARM, AArch64.
>
> Ok?

Note that if you are here then one issue is that the inliner uses
this very same function to estimate cost of function parameters/returns
that are eventually passed/returned in registers.  That's of course
a pre-existing issue.

+ "This target hook should return an estimate of the number of\n\
+instructions which will be emitted when copying an object with a size\n\
+in units @var{size}.\n\

I'm confused by this sentence.  Doesn't it mean to say
"when copying an object with size @var{size} in units of word_mode."?

It's always difficult when transforming a heuristic using existing
target macros to a new hook.  It would be best to think about the
heuristic itself again and make the hook more closely match
the uses of the heuristic.  In this case it would mean splitting
this up into the load/store and the function parameter case.

Note that estimate_move_cost is used elsewhere as well.

Richard.

> Thanks,
> James
>
> ---
> 2014-09-25  James Greenhalgh  
>
> * target.def (estimate_block_copy_ninsns): New.
> * targhooks.h (default_estimate_block_copy_ninsns): New.
> * targhooks.c (default_estimate_block_copy_ninsns): New.
> * tree-inline.c (estimate_move_cost): Use new target hook.
> * doc/tm.texi.in (TARGET_ESTIMATE_BLOCK_COPY_NINSNS): New.
> * doc/tm.texi: Regenerate.


[PATCH i386 AVX512] [57/n] Extend blend/cmp/brodcast insn patterns.

2014-09-26 Thread Kirill Yukhin
Hello,
The patch at the bottom extends the blend/cmp/broadcast
insn patterns.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_insn "avx512f_blendm"): Delete.
(define_insn "_blendm"): New.
(define_insn "_blendm"): Ditto.
(define_mode_attr cmp_imm_predicate): Add V8SF, V4DF, V8SI, V4DI, V4SF,
V2DF, V4SI, V2DI, V32HI, V64QI, V16HI, V32QI, V8HI, V16QI modes.
(define_insn
"avx512f_cmp3"):
Remove.
(define_insn
"_cmp3"):
New.
(define_insn
"_cmp3"):
Ditto.
(define_insn "avx512f_vec_dup"): Delete.
(define_insn "_vec_dup"): New.
(define_insn "_vec_dup"): Ditto.
(define_insn "avx512f_vec_dup_gpr"):
Delete.
(define_insn
"_vec_dup_gpr"):
New.
(define_insn
"_vec_dup_gpr"):
Ditto.
(define_insn "avx512f_vec_dup_mem"):
Delete.
(define_insn
"_vec_dup_mem"):
New.
(define_insn
"_vec_dup_mem"):
Ditto.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9edfebc..43d6655 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -954,14 +954,26 @@
(set_attr "memory" "none,load")
(set_attr "mode" "")])
 
-(define_insn "avx512f_blendm"
-  [(set (match_operand:VI48F_512 0 "register_operand" "=v")
-   (vec_merge:VI48F_512
- (match_operand:VI48F_512 2 "nonimmediate_operand" "vm")
- (match_operand:VI48F_512 1 "register_operand" "v")
+(define_insn "_blendm"
+  [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
+   (vec_merge:V48_AVX512VL
+ (match_operand:V48_AVX512VL 2 "nonimmediate_operand" "vm")
+ (match_operand:V48_AVX512VL 1 "register_operand" "v")
  (match_operand: 3 "register_operand" "Yk")))]
   "TARGET_AVX512F"
-  "vblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
+  "vblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "_blendm"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
+   (vec_merge:VI12_AVX512VL
+ (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
+ (match_operand:VI12_AVX512VL 1 "register_operand" "v")
+ (match_operand: 3 "register_operand" "Yk")))]
+  "TARGET_AVX512BW"
+  "vpblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
   [(set_attr "type" "ssemov")
(set_attr "prefix" "evex")
(set_attr "mode" "")])
@@ -2467,14 +2479,21 @@
(set_attr "mode" "")])
 
 (define_mode_attr cmp_imm_predicate
-  [(V16SF "const_0_to_31_operand") (V8DF "const_0_to_31_operand")
-  (V16SI "const_0_to_7_operand") (V8DI "const_0_to_7_operand")])
-
-(define_insn "avx512f_cmp3"
+  [(V16SF "const_0_to_31_operand")  (V8DF "const_0_to_31_operand")
+   (V16SI "const_0_to_7_operand")   (V8DI "const_0_to_7_operand")
+   (V8SF "const_0_to_31_operand")   (V4DF "const_0_to_31_operand")
+   (V8SI "const_0_to_7_operand")(V4DI "const_0_to_7_operand")
+   (V4SF "const_0_to_31_operand")   (V2DF "const_0_to_31_operand")
+   (V4SI "const_0_to_7_operand")(V2DI "const_0_to_7_operand")
+   (V32HI "const_0_to_7_operand")   (V64QI "const_0_to_7_operand")
+   (V16HI "const_0_to_7_operand")   (V32QI "const_0_to_7_operand")
+   (V8HI "const_0_to_7_operand")(V16QI "const_0_to_7_operand")])
+
+(define_insn "_cmp3"
   [(set (match_operand: 0 "register_operand" "=Yk")
(unspec:
- [(match_operand:VI48F_512 1 "register_operand" "v")
-  (match_operand:VI48F_512 2 "" 
"")
+ [(match_operand:V48_AVX512VL 1 "register_operand" "v")
+  (match_operand:V48_AVX512VL 2 "nonimmediate_operand" 
"")
   (match_operand:SI 3 "" "n")]
  UNSPEC_PCMP))]
   "TARGET_AVX512F && "
@@ -2484,6 +2503,20 @@
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
+(define_insn "_cmp3"
+  [(set (match_operand: 0 "register_operand" "=Yk")
+   (unspec:
+ [(match_operand:VI12_AVX512VL 1 "register_operand" "v")
+  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
+  (match_operand:SI 3 "" "n")]
+ UNSPEC_PCMP))]
+  "TARGET_AVX512BW"
+  "vpcmp\t{%3, %2, %1, 
%0|%0, %1, %2, %3}"
+  [(set_attr "type" "ssecmp")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "avx512f_ucmp3"
   [(set (match_operand: 0 "register_operand" "=Yk")
(unspec:
@@ -16207,13 +16240,13 @@
#"
   [(set_attr "type" "ssemov")
(set_attr "prefix_extra" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
(set_attr "isa" "*,avx2,noavx2")
(set_attr "mode" "V8SF")])
 
-(define_insn "avx512f_vec_dup"
-  [(set (match_operand:VI48F_512 0 "register_operand" "=v")
-   (vec_duplicate:VI48F_512
+(define_insn "_vec_dup"
+  [(set (match_

Re: [Patchv2 3/4] Control SRA and IPA-SRA by a param rather than MOVE_RATIO

2014-09-26 Thread Richard Biener
On Thu, Sep 25, 2014 at 4:57 PM, James Greenhalgh
 wrote:
>
> Hi,
>
> After hookizing MOVE_BY_PIECES_P and migrating tree-inline.c, we are
> left with only one user of MOVE_RATIO - deciding the maximum size of
> aggregate for SRA.
>
> Past discussions have made it clear [1] that keeping this use of
> MOVE_RATIO is undesirable. Clearly it is now also misnamed.
>
> The previous iteration of this patch was rejected as too complicated. I
> went off and tried simplifying it to use MOVE_RATIO, but if we do that we
> end up breaking some interface boundaries between the driver and the
> backend.
>
> This patch partially hookizes MOVE_RATIO under the new name
> TARGET_MAX_SCALARIZATION_SIZE and uses it to set default values for two
> new parameters:
>
>   sra-max-scalarization-size-Ospeed - The maximum size of aggregate
>   to consider when compiling for speed
>   sra-max-scalarization-size-Osize - The maximum size of aggregate
>   to consider when compiling for size.
>
> We then modify SRA to use these parameters rather than MOVE_RATIO.
>
> Bootstrapped and regression tested for x86, arm and aarch64 with no
> issues.
>
> OK for trunk?

+/* Return the maximum size in bytes of aggregate which will be considered
+   for replacement by SRA/IP-SRA.  */
+DEFHOOK
+(max_scalarization_size,
+ "This target hook is used by the Scalar Replacement of Aggregates passes\n\
+(SRA and IPA-SRA).  This hook gives the maximimum size, in storage units,\n\
+of aggregate to consider for replacement.  @var{speed_p} is true if we are\n\
+currently compiling for speed.\n\
+\n\
+By default, the maximum scalarization size is determined by MOVE_RATIO,\n\
+if it is defined.  Otherwise, a sensible default is chosen.\n\

doesn't match

+unsigned int
+default_max_scalarization_size (bool speed_p ATTRIBUTE_UNUSED)
+{
+  return get_move_ratio (speed_p) * MOVE_MAX_PIECES;

+unsigned int
+get_max_scalarization_size (bool speed_p)
+{
+  unsigned param_max_scalarization_size
+= speed_p
+  ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)
+  : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE);
+
+  if (!param_max_scalarization_size)
+return targetm.max_scalarization_size (speed_p);
+

the target-hook takes a size_p parameter, here you have a speed_p
parameter but call it as

+  unsigned i;
+  unsigned int max_scalarization_size
+= get_max_scalarization_size (optimize_function_for_size_p (cfun))
+  * BITS_PER_UNIT;

there is some mismatch.  Not sure if we generally prefer speed_p
over size_p, grepping headers shows zero size_p parameters and
some speed_p ones.

Given that the special value denoting the default for the new --params is
zero, a user cannot disable scalarization that way.

I still somehow dislike that you need a target hook to compute the
default.  Why doesn't it work to do, in opts.c:default_options_optimization

maybe_set_param_value
  (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED,
   get_move_ratio (speed_p) * MOVE_MAX_PIECES,
   opts->x_param_values, opts_set->x_param_values);

and override that default in the target's option_override hook the same way?

Thanks,
Richard.

> [1]: https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01997.html
>
> ---
> gcc/
>
> 2014-09-25  James Greenhalgh  
>
> * doc/invoke.texi (sra-max-scalarization-size-Ospeed): Document.
> (sra-max-scalarization-size-Osize): Likewise.
> * doc/tm.texi.in
> (MOVE_RATIO): Reduce documentation to a stub, deprecate.
> (TARGET_MAX_SCALARIZATION_SIZE): Add hook.
> * doc/tm.texi: Regenerate.
> * defaults.h (MOVE_RATIO): Remove default implementation.
> (SET_RATIO): Add a default implementation if MOVE_RATIO
> is not defined.
> * params.def (sra-max-scalarization-size-Ospeed): New.
> (sra-max-scalarization-size-Osize): Likewise.
> * target.def (max_scalarization_size): New.
> * targhooks.c (default_max_scalarization_size): New.
> * targhooks.h (default_max_scalarization_size): New.
> * tree-sra.c (get_max_scalarization_size): New.
> (analyze_all_variable_accesses): Use it.
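
The review point about zero being the "use the default" value can be seen in a toy model of the quoted get_max_scalarization_size. This is only a sketch under assumptions: `struct toy_params`, `toy_target_default` and the numbers 64 and 16 are invented for illustration, and the elided tail of the quoted function is assumed to return the param value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two --param values; 0 means "unset".  */
struct toy_params
{
  unsigned max_size_speed;
  unsigned max_size_size;
};

/* Hypothetical target default, standing in for the target hook.  */
unsigned
toy_target_default (bool speed_p)
{
  return speed_p ? 64 : 16;
}

/* Modelled on the quoted function: a zero param falls back to the
   target default, so zero can never mean "disable scalarization".  */
unsigned
toy_get_max_scalarization_size (const struct toy_params *p, bool speed_p)
{
  unsigned param = speed_p ? p->max_size_speed : p->max_size_size;
  if (!param)
    return toy_target_default (speed_p);
  return param;
}
```

The sketch also shows why the flag mismatch matters: a caller that passes a "for size" predicate into the `speed_p` argument would select the opposite param and the opposite default.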


Re: [Patch 1/4] Hookize MOVE_BY_PIECES_P, remove most uses of MOVE_RATIO

2014-09-26 Thread Richard Biener
On Thu, Sep 25, 2014 at 5:08 PM, Steven Bosscher  wrote:
> On Thu, Sep 25, 2014 at 4:57 PM, James Greenhalgh wrote:
>> * doc/tm.texi.in (MOVE_BY_PIECES_P): Reduce documentation to a stub
>> describing that this macro is deprecated.
>
> Remove it entirely and poison it in system.h?
> It takes changes to only a few targets: mips, arc, s390, and sh.
>
> Thanks for hookizing this!

Indeed.

The patch is ok - please consider doing what Steven suggested as
followup.

Thanks,
Richard.

> Ciao!
> Steven


[PATCH][match-and-simplify] Merge some more code-gen code

2014-09-26 Thread Richard Biener

Bootstrapped on x86_64-unknown-linux-gnu, applied.

Richard.

2014-09-26  Richard Biener  

* genmatch.c (struct dt_node): Merge gen_gimple and gen_generic
into gen, merge gen_gimple_kids and gen_generic_kids into
gen_kids.
(struct dt_operand): Likewise.

Index: gcc/genmatch.c
===================================================================
--- gcc/genmatch.c  (revision 215600)
+++ gcc/genmatch.c  (working copy)
@@ -927,11 +927,9 @@ struct dt_node
   dt_node *append_match_op (dt_operand *, dt_node *parent = 0, unsigned pos = 
0);
   dt_node *append_simplify (simplify *, unsigned, dt_operand **); 
 
-  virtual void gen_gimple (FILE *) {}
-  virtual void gen_generic (FILE *) {}
+  virtual void gen (FILE *, bool) {}
 
-  void gen_gimple_kids (FILE *);
-  void gen_generic_kids (FILE *);
+  void gen_kids (FILE *, bool);
 };
 
 struct dt_operand : public dt_node
@@ -946,8 +944,7 @@ struct dt_operand : public dt_node
   : dt_node (type), op (op_), match_dop (match_dop_),
   parent (parent_), pos (pos_) {}
 
-  virtual void gen_gimple (FILE *);
-  virtual void gen_generic (FILE *);
+  void gen (FILE *, bool);
   unsigned gen_predicate (FILE *, const char *, bool);
   unsigned gen_match_op (FILE *, const char *);
 
@@ -969,8 +966,6 @@ struct dt_simplify : public dt_node
  indexes (indexes_)  {}
 
   void gen (FILE *f, bool);
-  virtual void gen_gimple (FILE *f) { gen (f, true); }
-  virtual void gen_generic (FILE *f) { gen (f, false); }
 };
 
 template<>
@@ -1589,14 +1584,16 @@ dt_operand::gen_generic_expr (FILE *f, c
 }
 
 void
-dt_node::gen_gimple_kids (FILE *f)
+dt_node::gen_kids (FILE *f, bool gimple)
 {
   auto_vec gimple_exprs;
   auto_vec generic_exprs;
   auto_vec fns;
+  auto_vec generic_fns;
   auto_vec preds;
   auto_vec others;
   dt_node *true_operand = NULL;
+
   for (unsigned i = 0; i < kids.length (); ++i)
 {
   if (kids[i]->type == dt_node::DT_OPERAND)
@@ -1607,11 +1604,21 @@ dt_node::gen_gimple_kids (FILE *f)
  if (e->ops.length () == 0)
generic_exprs.safe_push (op);
  else if (e->operation->kind == id_base::FN)
-   fns.safe_push (op);
+   {
+ if (gimple)
+   fns.safe_push (op);
+ else
+   generic_fns.safe_push (op);
+   }
  else if (e->operation->kind == id_base::PREDICATE)
preds.safe_push (op);
  else
-   gimple_exprs.safe_push (op);
+   {
+ if (gimple)
+   gimple_exprs.safe_push (op);
+ else
+   generic_exprs.safe_push (op);
+   }
}
  else if (op->op->type == operand::OP_PREDICATE)
others.safe_push (kids[i]);
@@ -1633,13 +1640,16 @@ dt_node::gen_gimple_kids (FILE *f)
   unsigned exprs_len = gimple_exprs.length ();
   unsigned gexprs_len = generic_exprs.length ();
   unsigned fns_len = fns.length ();
+  unsigned gfns_len = generic_fns.length ();
 
-  if (exprs_len || fns_len || gexprs_len)
+  if (exprs_len || fns_len || gexprs_len || gfns_len)
 {
   if (exprs_len)
gimple_exprs[0]->get_name (kid_opname);
   else if (fns_len)
fns[0]->get_name (kid_opname);
+  else if (gfns_len)
+   generic_fns[0]->get_name (kid_opname);
   else
generic_exprs[0]->get_name (kid_opname);
 
@@ -1667,7 +1677,7 @@ dt_node::gen_gimple_kids (FILE *f)
  else
fprintf (f, "case %s:\n", op->id);
  fprintf (f, "{\n");
- gimple_exprs[i]->gen_gimple (f);
+ gimple_exprs[i]->gen (f, true);
  fprintf (f, "break;\n"
   "}\n");
}
@@ -1691,7 +1701,7 @@ dt_node::gen_gimple_kids (FILE *f)
  expr *e = as_a (fns[i]->op);
  fprintf (f, "case %s:\n"
   "{\n", e->operation->id);
- fns[i]->gen_gimple (f);
+ fns[i]->gen (f, true);
  fprintf (f, "break;\n"
   "}\n");
}
@@ -1708,16 +1718,44 @@ dt_node::gen_gimple_kids (FILE *f)
   for (unsigned i = 0; i < generic_exprs.length (); ++i)
 {
   expr *e = as_a (generic_exprs[i]->op);
-  fprintf (f, "case %s:\n"
-  "{\n", e->operation->id);
-
-  generic_exprs[i]->gen_gimple (f);
+  id_base *op = e->operation;
+  /* ??? CONVERT */
+  fprintf (f, "case %s:\n", op->id);
+  fprintf (f, "{\n");
+  generic_exprs[i]->gen (f, gimple);
   fprintf (f, "break;\n"
   "}\n");
 }
 
+  if (gfns_len)
+{
+  fprintf (f, "case CALL_EXPR:\n"
+  "{\n"
+  "tree fndecl = get_callee_fndecl (%s);\n"
+  "if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == 
BUILT_IN_NORMAL)\n"
+  "switch (DECL_FUNCTION_CODE (fndecl))\n"
+  "{\n", kid_opname);
+
+  for (unsigned

Re: [Bug libstdc++/62313] Data race in debug iterators

2014-09-26 Thread Jonathan Wakely

On 26/09/14 00:00 +0200, François Dumont wrote:



Apart from those minor adjustments I think this looks good, but I'd
like to know that it has been tested with -fsanitize=thread, even if
only lightly tested.




Hi

   Dmitry, who reported the bug, confirmed the fix. Can I go ahead 
and commit ?


Yes, OK.


Re: [PATCH, i386, Pointer Bounds Checker 33/x] MPX ABI

2014-09-26 Thread Ilya Enkovich
Adding Vladimir.

Ilya

2014-09-25 13:46 GMT+04:00 Ilya Enkovich :
> 2014-09-25 1:51 GMT+04:00 Ilya Enkovich :
>> 2014-09-24 23:09 GMT+04:00 Jeff Law :
>>> On 09/24/14 07:13, Ilya Enkovich wrote:

>>>> I tried to generate PARALLEL with all regs set by call.  Here is a
>>>> memset call I got:
>>>>
>>>> (call_insn 23 22 24 2 (set (parallel [
>>>>  (expr_list:REG_DEP_TRUE (reg:DI 0 ax)
>>>>  (const_int 0 [0]))
>>>>  (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
>>>>  (const_int 0 [0]))
>>>>  (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
>>>>  (const_int 0 [0]))
>>>>  ])
>>>>  (call/j (mem:QI (symbol_ref:DI ("memset") [flags 0x41]
>>>
>>> [ snip ]
>>> Looks good.  This is the approved way to handle multiple results of a call.
>>>

>>>> During register allocation LRA generated a weird move instruction:
>>>>
>>>> (insn 63 0 0 (set (reg/f:DI 100)
>>>>  (parallel [
>>>>  (expr_list:REG_DEP_TRUE (reg:DI 0 ax)
>>>>  (const_int 0 [0]))
>>>>  (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
>>>>  (const_int 0 [0]))
>>>>  (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
>>>>  (const_int 0 [0]))
>>>>  ])) -1
>>>>   (nil))
>>>>
>>>> Which caused ICE later in LRA.  This move happens because of
>>>> REG_RETURNED (reg/f:DI 100) (see condition in inherit_in_ebb at
>>>> lra-constraints.c:5312).  Thus this code in LRA doesn't accept
>>>> PARALLEL dest for calls.
>>>
>>> This is a bug in LRA then.  Multiple return values aren't heavily used, so
>>> I'm not surprised that its handling was missed in LRA.
>>>
>>> The question now is how to bundle things together in such a way as to make
>>> it easy for Vlad to reproduce and fix this in LRA.
>>>
>>> Jeff
>>
>> I suppose it should be easy to reproduce using the same test case I
>> use and some pseudo patch which adds fake return values (e.g. xmm6 and
>> xmm7) to calls.  Will try to make a minimal patch and test that Vlad
>> could work with.
>>
>> Ilya
>
> I couldn't reproduce the problem on a small test, but a chrome build
> shows a lot of errors.  Due to the nature of the problem, the test's size
> shouldn't matter, so I attach a patch which emulates the situation with
> bounds regs (but uses xmm5 and xmm6 instead of bnd0 and bnd1) together with a
> preprocessed chrome file.
>
> I applied the patch to revision 215580.
>
> Command to reproduce:
>
>>g++ -O2 -c generated_message_reflection.ii
> ../../third_party/protobuf/src/google/protobuf/generated_message_reflection.cc:
> In member function 'virtual void
> google::protobuf::internal::GeneratedMessageReflection::AddBool(google::protobuf::Message*,
> const google::protobuf::FieldDescriptor*, bool) const':
> ../../third_party/protobuf/src/google/protobuf/generated_message_reflection.cc:726:3910:
> internal compiler error: in lra_set_insn_recog_data, at lra.c:941
> 0xc7b969 lra_set_insn_recog_data(rtx_insn*)
> ../../gcc-pl/gcc/lra.c:939
> 0xc79822 lra_get_insn_recog_data
> ../../gcc-pl/gcc/lra-int.h:473
> 0xc7d426 lra_update_insn_regno_info(rtx_insn*)
> ../../gcc-pl/gcc/lra.c:1600
> 0xc7d690 lra_push_insn_1
> ../../gcc-pl/gcc/lra.c:1653
> 0xc7d6c0 lra_push_insn(rtx_insn*)
> ../../gcc-pl/gcc/lra.c:1661
> 0xc7d7bf push_insns
> ../../gcc-pl/gcc/lra.c:1704
> 0xc7da47 lra_process_new_insns(rtx_insn*, rtx_insn*, rtx_insn*, char const*)
> ../../gcc-pl/gcc/lra.c:1758
> 0xc94c80 inherit_in_ebb
> ../../gcc-pl/gcc/lra-constraints.c:5356
> 0xc9599c lra_inheritance()
> ../../gcc-pl/gcc/lra-constraints.c:5560
> 0xc7e86c lra(_IO_FILE*)
> ../../gcc-pl/gcc/lra.c:2223
> 0xc2eab8 do_reload
> ../../gcc-pl/gcc/ira.c:5311
> 0xc2edfe execute
> ../../gcc-pl/gcc/ira.c:5470
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
>
> The problem as I see it is that in lra-constraints.c:5352 we do not
> check that the call dest is actually a register.  But probably REG_RETURNED
> shouldn't be applied to such a call because it is not clear to which
> return value it applies.
>
> Thanks,
> Ilya


Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes

2014-09-26 Thread Ilya Enkovich
Ping

2014-06-06 10:54 GMT+04:00 Ilya Enkovich :
> On 03 Jun 10:59, Richard Biener wrote:
>> On Mon, Jun 2, 2014 at 5:13 PM, Ilya Enkovich  wrote:
>> > 2014-06-02 17:37 GMT+04:00 Richard Biener :
>> >> On Mon, Jun 2, 2014 at 2:44 PM, Ilya Enkovich  
>> >> wrote:
>> >>> 2014-06-02 15:35 GMT+04:00 Richard Biener :
>> >>>> On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich  wrote:
>> >>>>> Hi,
>> >>>>>
>> >>>>> This patch adds Pointer Bounds Checker passes.  Versioning happens
>> >>>>> before early local passes.  Early local passes are split into 3
>> >>>>> stages to have everything instrumented before any optimization applies.
>> >>>>
>> >>>> That looks artificial to me.  If you need to split up early_local_passes
>> >>>> then do that - nesting three IPA pass groups inside it looks odd to me.
>> >>>> Btw - doing this in three "IPA phases" makes things possibly slower
>> >>>> due to cache effects.  It might be worth pursuing to move the early
>> >>>> stage completely to the lowering pipeline.
>> >>>
>> >>> Early local passes is some special case because these passes are
>> >>> executed separately for new functions. I did not want to get three
>> >>> special passes instead and therefore made split inside.
>> >>
>> >> Yeah, but all passes are already executed via execute_early_local_passes,
>> >> so it would be only an implementation detail.
>> >>
>> >>> If you prefer split pass itself, I suppose pass_early_local_passes may
>> >>> be replaced with something like pass_build_ssa_passes +
>> >>> pass_chkp_instrumentation_passes + pass_ipa_chkp_produce_thunks +
>> >>> pass_local_optimization_passes. execute_early_local_passes would
>> >>> execute gimple passes lists of pass_build_ssa_passes,
>> >>> pass_chkp_instrumentation_passes and pass_local_optimization_passes.
>> >>>
>> >>> I think we cannot have the first stage moved into lowering passes
>> >>> because it should be executed for newly created functions.
>> >>
>> >> Well, let's defer that then.
>> >>
>> >>>>
>> >>>> Btw, fixup_cfg only needs to run once local_pure_const was run
>> >>>> on a callee, thus it shouldn't be necessary to run it from the
>> >>>> first group.
>> >>>
>> >>> OK. Will try to remove it from the first group.
>> >>>
>> >>>>
>> >>>>  void
>> >>>>  pass_manager::execute_early_local_passes ()
>> >>>>  {
>> >>>> -  execute_pass_list (pass_early_local_passes_1->sub);
>> >>>> +  execute_pass_list (pass_early_local_passes_1->sub->sub);
>> >>>> +  execute_pass_list (pass_early_local_passes_1->sub->next->sub);
>> >>>> +  execute_pass_list
>> >>>> (pass_early_local_passes_1->sub->next->next->next->sub);
>> >>>>  }
>> >>>>
>> >>>> is gross - it should be enough to execute the early local pass list
>> >>>> (obsolete comment with the suggestion above).
>> >>>
>> >>> This function should call only gimple passes for cfun. Therefore we
>> >>> cannot call IPA passes here and have to execute each gimple pass list
>> >>> separately.
>> >>
>> >> Ok, given a different split this would then become somewhat more sane
>> >> anyway.
>> >
>> > Sorry, didn't catch it. Should I try a different split or defer it? :)
>>
>> Please try a different split.  Defer moving the first part to the
>> lowering stage.
>>
>> Richard.
>>
>
> Here is a new version with a new split: pass_early_local_passes is replaced
> with three new passes.  I left execute_early_local_passes unrenamed.  I had to
> fix the test gcc.dg/pr37858.c, which uses the -fdump-ipa-early_local_cleanups
> option.
>
> I could not get rid of the additional pass_fixup_cfg.  Its removal caused a wrong
> CFG (a call to a noreturn function in the middle of a BB), which confused the
> checker logic.  I moved this pass into the checker passes list.
>
> Bootstrapped and tested on linux-x86_64.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2014-06-05  Ilya Enkovich  
>
> * tree-chkp.c: New.
> * tree-chkp.h: New.
> * rtl-chkp.c: New.
> * rtl-chkp.h: New.
> * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o.
> (GTFILES): Add tree-chkp.c.
> * c-family/c.opt (fchkp-check-incomplete-type): New.
> (fchkp-zero-input-bounds-for-main): New.
> (fchkp-first-field-has-own-bounds): New.
> (fchkp-narrow-bounds): New.
> (fchkp-narrow-to-innermost-array): New.
> (fchkp-optimize): New.
> (fchkp-use-fast-string-functions): New.
> (fchkp-use-nochk-string-functions): New.
> (fchkp-use-static-bounds): New.
> (fchkp-use-static-const-bounds): New.
> (fchkp-treat-zero-dynamic-size-as-infinite): New.
> (fchkp-check-read): New.
> (fchkp-check-write): New.
> (fchkp-store-bounds): New.
> (fchkp-instrument-calls): New.
> (fchkp-instrument-marked-only): New.
> * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
> __CHKP__ macro when Pointer Bounds Checker is on.
> * passes.def (pass_ipa_chkp_versioning): New.
> (pass_early_local_passes): Removed.
> (pass_build_

Re: [PATCH 2/5] Existing call graph infrastructure enhancement

2014-09-26 Thread Martin Liška

On 09/24/2014 05:01 PM, Jan Hubicka wrote:

Hi.

The following patch enhances API functions to be ready for the main patch of
this patchset.

Ready for trunk?

Thank you,
Martin



gcc/ChangeLog:

2014-09-21  Martin Liška  

* cgraph.c (cgraph_node::release_body): New argument keep_arguments
introduced.
* cgraph.h: Likewise.
* cgraphunit.c (cgraph_node::create_wrapper): Usage of new argument 
introduced.
* ipa-devirt.c (polymorphic_type_binfo_p): Safe check for binfos 
created by Java.
* tree-ssa-alias.c (ao_ref_base_alias_set): Static function transformed 
to global.
* tree-ssa-alias.h: Likewise.



diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 8f04284..d40a2922 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1637,13 +1637,15 @@ release_function_body (tree decl)
 are free'd in final.c via free_after_compilation().  */

  void
-cgraph_node::release_body (void)
+cgraph_node::release_body (bool keep_arguments)
  {
ipa_transforms_to_apply.release ();
if (!used_as_abstract_origin && symtab->state != PARSING)
  {
DECL_RESULT (decl) = NULL;
-  DECL_ARGUMENTS (decl) = NULL;
+
+  if (!keep_arguments)
+   DECL_ARGUMENTS (decl) = NULL;
  }
/* If the node is abstract and needed, then do not clear DECL_INITIAL
   of its associated function function declaration because it's
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index a316e40..19ce3b8 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -915,7 +915,7 @@ public:
   Use this only for functions that are released before being translated to
   target code (i.e. RTL).  Functions that are compiled to RTL and beyond
   are free'd in final.c via free_after_compilation().  */
-  void release_body (void);
+  void release_body (bool keep_arguments = false);


Please add documentation for KEEP_ARGUMENTS explaining that it is useful only
if you want to rebuild the body as a thunk.


/* cgraph_node is no longer nested function; update cgraph accordingly.  */
void unnest (void);
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 3e3b8d2..c4597e2 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2300,7 +2300,7 @@ cgraph_node::create_wrapper (cgraph_node *target)
  tree decl_result = DECL_RESULT (decl);

  /* Remove the function's body.  */

I would say "Remove the function's body but keep arguments to be reused for
the thunk."

-release_body ();
+release_body (true);
  reset ();

  DECL_RESULT (decl) = decl_result;
diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index af42c6d..f374933 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -225,7 +225,7 @@ static inline bool
  polymorphic_type_binfo_p (tree binfo)
  {
/* See if BINFO's type has an virtual table associtated with it.  */
-  return BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));
+  return BINFO_TYPE (binfo) && BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));


Aha, this change was for Java, right? Please add a comment that Java produces
BINFOs without BINFO_TYPE set.

  }

  /* Return TRUE if all derived types of T are known and thus
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 442112a..1bf88e2 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -559,7 +559,7 @@ ao_ref_base (ao_ref *ref)

  /* Returns the base object alias set of the memory reference *REF.  */

-static alias_set_type
+alias_set_type
  ao_ref_base_alias_set (ao_ref *ref)
  {
tree base_ref;
diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
index 436381a..0d35283 100644
--- a/gcc/tree-ssa-alias.h
+++ b/gcc/tree-ssa-alias.h
@@ -98,6 +98,7 @@ extern void ao_ref_init (ao_ref *, tree);
  extern void ao_ref_init_from_ptr_and_size (ao_ref *, tree, tree);
  extern tree ao_ref_base (ao_ref *);
  extern alias_set_type ao_ref_alias_set (ao_ref *);
+extern alias_set_type ao_ref_base_alias_set (ao_ref *);


I can not approve this change, but I suppose it is what Richard suggested?



Here's an updated version of the patch that deals with Honza's notes.
Yes, I explicitly asked Richard if we can mark the function as global.

I will commit the patch soon.

Thank you,
Martin
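
For readers following the review, the effect of the new argument can be sketched outside GCC with a toy model. Everything below (`decl_model`, `node_model`) is hypothetical and only mirrors the shape of the cgraph.c hunk; it deliberately omits the `used_as_abstract_origin` and `symtab->state` guards of the real function.

```cpp
#include <cassert>

// Hypothetical stand-in for the parts of a function decl touched here.
struct decl_model
{
  const char *result = "result";
  const char *arguments = "args";
};

// Hypothetical stand-in for cgraph_node; not the GCC class.
struct node_model
{
  decl_model decl;

  // With the default (false) both fields are cleared, as before the patch.
  // Passing true keeps the arguments so a thunk can be rebuilt around them.
  void release_body (bool keep_arguments = false)
  {
    decl.result = nullptr;
    if (!keep_arguments)
      decl.arguments = nullptr;
  }
};

// Small usage check: existing callers see no change in behaviour.
void
demo_release_body ()
{
  node_model plain;
  plain.release_body ();
  assert (plain.decl.arguments == nullptr);

  node_model wrapper;
  wrapper.release_body (true);
  assert (wrapper.decl.arguments != nullptr);
}
```

The default argument is what lets all existing `release_body ()` call sites compile unchanged while `create_wrapper` opts in explicitly.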


Patch is OK except for the tree-ssa-alias bits.
Honza

  extern bool ptr_deref_may_alias_global_p (tree);
  extern bool ptr_derefs_may_alias_p (tree, tree);
  extern bool ref_may_alias_global_p (tree);




diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 1cfc783..fdcaf79 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1625,16 +1625,19 @@ release_function_body (tree decl)
 /* Release memory used to represent body of function.
Use this only for functions that are released before being translated to
target code (i.e. RTL).  Functions that are compiled to RTL and beyond
-   are free'd in final.c via free_after_compilation().  */
+   are free'd in final.c via free_after_compilation().
+   KEEP_ARGUMENTS are useful only if you want to rebuild body as thunk.  */
 
 void
-cgraph_node::release_body (void)
+cgraph_node::rele

[PATCH i386 AVX512] [58/n] Add vpmul[u]dq insn patterns.

2014-09-26 Thread Kirill Yukhin
Hello,
The patch at the bottom adds support for vpmul[u]dq insn
patterns.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?
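
As a reading aid for the patterns below, here is a scalar sketch of what an unsigned "widen multiply even" computes on eight 32-bit lanes, plus a simple model of merge-masking. The names (`widen_umult_even`, `apply_merge_mask`) are made up for illustration and are not GCC or intrinsic APIs; the masking model assumes merge semantics (masked-off lanes keep a pass-through value).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using v8si = std::array<std::uint32_t, 8>;
using v4di = std::array<std::uint64_t, 4>;

// Scalar model of an unsigned widening multiply of the even lanes:
// r[i] = zero_extend (a[2*i]) * zero_extend (b[2*i]).
v4di
widen_umult_even (const v8si &a, const v8si &b)
{
  v4di r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = static_cast<std::uint64_t> (a[2 * i]) * b[2 * i];
  return r;
}

// Model of merge-masking: lanes whose mask bit is clear keep the
// pass-through value instead of the computed one.
v4di
apply_merge_mask (const v4di &computed, const v4di &passthru, unsigned mask)
{
  v4di r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = ((mask >> i) & 1u) ? computed[i] : passthru[i];
  return r;
}

// Small usage check.
void
demo_widen_umult_even ()
{
  const v8si a{0xFFFFFFFFu, 1, 2, 3, 4, 5, 6, 7};
  const v8si b{0xFFFFFFFFu, 9, 3, 9, 5, 9, 7, 9};
  const v4di r = widen_umult_even (a, b);
  assert (r[0] == 0xFFFFFFFE00000001ull);  // the full 64-bit product is kept
  assert (r[1] == 6 && r[2] == 20 && r[3] == 42);
}
```

The odd lanes never contribute, which matches the `(parallel [(const_int 0) (const_int 2) ...])` selections in the RTL.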

gcc/
* config/i386/sse.md
(define_expand "vec_widen_umult_even_v8si"): Add masking.
(define_insn "*vec_widen_umult_even_v8si"): Ditto.
(define_expand "vec_widen_umult_even_v4si"): Ditto.
(define_insn "*vec_widen_umult_even_v4si"): Ditto.
(define_expand "vec_widen_smult_even_v8si"): Ditto.
(define_insn "*vec_widen_smult_even_v8si"): Ditto.
(define_expand "sse4_1_mulv2siv2di3"): Ditto.
(define_insn "*sse4_1_mulv2siv2di3"): Ditto.
(define_insn "avx512dq_mul3"): New.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 43d6655..e52d40c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -9286,7 +9286,7 @@
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
 
-(define_expand "vec_widen_umult_even_v8si"
+(define_expand "vec_widen_umult_even_v8si"
   [(set (match_operand:V4DI 0 "register_operand")
(mult:V4DI
  (zero_extend:V4DI
@@ -9299,29 +9299,30 @@
  (match_operand:V8SI 2 "nonimmediate_operand")
  (parallel [(const_int 0) (const_int 2)
 (const_int 4) (const_int 6)])]
-  "TARGET_AVX2"
+  "TARGET_AVX2 && "
   "ix86_fixup_binary_operands_no_copy (MULT, V8SImode, operands);")
 
-(define_insn "*vec_widen_umult_even_v8si"
-  [(set (match_operand:V4DI 0 "register_operand" "=x")
+(define_insn "*vec_widen_umult_even_v8si"
+  [(set (match_operand:V4DI 0 "register_operand" "=v")
(mult:V4DI
  (zero_extend:V4DI
(vec_select:V4SI
- (match_operand:V8SI 1 "nonimmediate_operand" "%x")
+ (match_operand:V8SI 1 "nonimmediate_operand" "%v")
  (parallel [(const_int 0) (const_int 2)
 (const_int 4) (const_int 6)])))
  (zero_extend:V4DI
(vec_select:V4SI
- (match_operand:V8SI 2 "nonimmediate_operand" "xm")
+ (match_operand:V8SI 2 "nonimmediate_operand" "vm")
  (parallel [(const_int 0) (const_int 2)
 (const_int 4) (const_int 6)])]
-  "TARGET_AVX2 && ix86_binary_operator_ok (MULT, V8SImode, operands)"
-  "vpmuludq\t{%2, %1, %0|%0, %1, %2}"
+  "TARGET_AVX2 && 
+   && ix86_binary_operator_ok (MULT, V8SImode, operands)"
+  "vpmuludq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseimul")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
(set_attr "mode" "OI")])
 
-(define_expand "vec_widen_umult_even_v4si"
+(define_expand "vec_widen_umult_even_v4si"
   [(set (match_operand:V2DI 0 "register_operand")
(mult:V2DI
  (zero_extend:V2DI
@@ -9332,28 +9333,29 @@
(vec_select:V2SI
  (match_operand:V4SI 2 "nonimmediate_operand")
  (parallel [(const_int 0) (const_int 2)])]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && "
   "ix86_fixup_binary_operands_no_copy (MULT, V4SImode, operands);")
 
-(define_insn "*vec_widen_umult_even_v4si"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
+(define_insn "*vec_widen_umult_even_v4si"
+  [(set (match_operand:V2DI 0 "register_operand" "=x,v")
(mult:V2DI
  (zero_extend:V2DI
(vec_select:V2SI
- (match_operand:V4SI 1 "nonimmediate_operand" "%0,x")
+ (match_operand:V4SI 1 "nonimmediate_operand" "%0,v")
  (parallel [(const_int 0) (const_int 2)])))
  (zero_extend:V2DI
(vec_select:V2SI
- (match_operand:V4SI 2 "nonimmediate_operand" "xm,xm")
+ (match_operand:V4SI 2 "nonimmediate_operand" "xm,vm")
  (parallel [(const_int 0) (const_int 2)])]
-  "TARGET_SSE2 && ix86_binary_operator_ok (MULT, V4SImode, operands)"
+  "TARGET_SSE2 && 
+   && ix86_binary_operator_ok (MULT, V4SImode, operands)"
   "@
pmuludq\t{%2, %0|%0, %2}
-   vpmuludq\t{%2, %1, %0|%0, %1, %2}"
+   vpmuludq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
(set_attr "type" "sseimul")
(set_attr "prefix_data16" "1,*")
-   (set_attr "prefix" "orig,vex")
+   (set_attr "prefix" "orig,maybe_evex")
(set_attr "mode" "TI")])
 
 (define_expand "vec_widen_smult_even_v16si"
@@ -9401,7 +9403,7 @@
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
 
-(define_expand "vec_widen_smult_even_v8si"
+(define_expand "vec_widen_smult_even_v8si"
   [(set (match_operand:V4DI 0 "register_operand")
(mult:V4DI
  (sign_extend:V4DI
@@ -9414,30 +9416,31 @@
  (match_operand:V8SI 2 "nonimmediate_operand")
  (parallel [(const_int 0) (const_int 2)
 (const_int 4) (const_int 6)])]
-  "TARGET_AVX2"
+  "TARGET_AVX2 && "
   "ix86_fixup_binary_operands_no_copy (MULT, V8SImode, operands);")
 
-(define_insn "*vec_widen_smult_even_v8si"
-  [(set (match_operand:V4DI 0 "register_ope

[PATCH i386 AVX512] [59/n] Add vptest[n]m, ucmp, cmpeq insn patterns.

2014-09-26 Thread Kirill Yukhin
Hello,
The patch at the bottom adds support for vptest[n]m, ucmp, cmpeq.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?
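
For orientation, the following scalar sketch models what these mask-producing comparisons compute on eight 32-bit lanes. It assumes the usual 3-bit predicate encoding of the unsigned compare (0 EQ, 1 LT, 2 LE, 3 FALSE, 4 NE, 5 GE, 6 GT, 7 TRUE) as I understand it from the instruction set reference; the function names are hypothetical, not GCC builtins.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using v8si = std::array<std::uint32_t, 8>;

// Evaluate one lane under the 3-bit predicate; only the low 3 bits matter.
bool
ucmp_predicate (std::uint32_t a, std::uint32_t b, unsigned imm)
{
  switch (imm & 7u)
    {
    case 0: return a == b;
    case 1: return a < b;
    case 2: return a <= b;
    case 3: return false;
    case 4: return a != b;
    case 5: return a >= b;
    case 6: return a > b;
    default: return true;
    }
}

// Unsigned compare into a mask: bit i is set when lane i satisfies imm.
std::uint8_t
ucmp_mask (const v8si &a, const v8si &b, unsigned imm)
{
  unsigned mask = 0;
  for (std::size_t i = 0; i < a.size (); ++i)
    if (ucmp_predicate (a[i], b[i], imm))
      mask |= 1u << i;
  return static_cast<std::uint8_t> (mask);
}

// Test-under-mask: bit i is set when (a[i] & b[i]) is nonzero,
// or when it is zero if NEGATE is true (the "nm" variant).
std::uint8_t
testm_mask (const v8si &a, const v8si &b, bool negate)
{
  unsigned mask = 0;
  for (std::size_t i = 0; i < a.size (); ++i)
    if (((a[i] & b[i]) != 0) != negate)
      mask |= 1u << i;
  return static_cast<std::uint8_t> (mask);
}

// Small usage check.
void
demo_ucmp ()
{
  const v8si a{1, 2, 3, 4, 5, 6, 7, 8};
  const v8si b{1, 3, 3, 0, 5, 9, 7, 0};
  assert (ucmp_mask (a, b, 0) == 0x55);  // equal in lanes 0, 2, 4, 6
  assert (ucmp_mask (a, b, 1) == 0x22);  // a < b in lanes 1 and 5
}
```

Because only eight predicates exist, an immediate outside 0..7 has no meaning, which is what the "3-bit immediate" diagnostic in the i386.c hunk guards against.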

gcc/
* config/i386/i386.c
(ix86_expand_args_builtin): Handle CODE_FOR_avx512vl_cmpv4di3_mask,
CODE_FOR_avx512vl_cmpv8si3_mask, CODE_FOR_avx512vl_ucmpv4di3_mask,
CODE_FOR_avx512vl_ucmpv8si3_mask, CODE_FOR_avx512vl_cmpv2di3_mask,
CODE_FOR_avx512vl_cmpv4si3_mask, CODE_FOR_avx512vl_ucmpv2di3_mask,
CODE_FOR_avx512vl_ucmpv4si3_mask.
* config/i386/sse.md
(define_insn "avx512f_ucmp3"): Delete.
(define_insn "_ucmp3"): New.
(define_insn "_ucmp3"): Ditto.
(define_expand "_eq3"): Ditto.
(define_insn "_eq3_1"): Ditto.
(define_insn "_gt3"): Ditto.
(define_insn "_testm3"): Ditto.
(define_insn "_testnm3"): Ditto.

--
Thanks, K

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1aec70f..352ab81 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -34062,6 +34062,14 @@ ix86_expand_args_builtin (const struct builtin_description *d,
  case CODE_FOR_avx512f_cmpv16si3_mask:
  case CODE_FOR_avx512f_ucmpv8di3_mask:
  case CODE_FOR_avx512f_ucmpv16si3_mask:
+ case CODE_FOR_avx512vl_cmpv4di3_mask:
+ case CODE_FOR_avx512vl_cmpv8si3_mask:
+ case CODE_FOR_avx512vl_ucmpv4di3_mask:
+ case CODE_FOR_avx512vl_ucmpv8si3_mask:
+ case CODE_FOR_avx512vl_cmpv2di3_mask:
+ case CODE_FOR_avx512vl_cmpv4si3_mask:
+ case CODE_FOR_avx512vl_ucmpv2di3_mask:
+ case CODE_FOR_avx512vl_ucmpv4si3_mask:
error ("the last argument must be a 3-bit immediate");
return const0_rtx;
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index e52d40c..625a2e0 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -2517,11 +2517,25 @@
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_insn "avx512f_ucmp3"
+(define_insn "_ucmp3"
   [(set (match_operand: 0 "register_operand" "=Yk")
(unspec:
- [(match_operand:VI48_512 1 "register_operand" "v")
-  (match_operand:VI48_512 2 "nonimmediate_operand" "vm")
+ [(match_operand:VI12_AVX512VL 1 "register_operand" "v")
+  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
+  (match_operand:SI 3 "const_0_to_7_operand" "n")]
+ UNSPEC_UNSIGNED_PCMP))]
+  "TARGET_AVX512BW"
+  "vpcmpu\t{%3, %2, %1, 
%0|%0, %1, %2, %3}"
+  [(set_attr "type" "ssecmp")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "_ucmp3"
+  [(set (match_operand: 0 "register_operand" "=Yk")
+   (unspec:
+ [(match_operand:VI48_AVX512VL 1 "register_operand" "v")
+  (match_operand:VI48_AVX512VL 2 "nonimmediate_operand" "vm")
   (match_operand:SI 3 "const_0_to_7_operand" "n")]
  UNSPEC_UNSIGNED_PCMP))]
   "TARGET_AVX512F"
@@ -10265,20 +10279,42 @@
(set_attr "prefix" "vex")
(set_attr "mode" "OI")])
 
-(define_expand "avx512f_eq3"
+(define_expand "_eq3"
+  [(set (match_operand: 0 "register_operand")
+   (unspec:
+ [(match_operand:VI12_AVX512VL 1 "register_operand")
+  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand")]
+ UNSPEC_MASKED_EQ))]
+  "TARGET_AVX512BW"
+  "ix86_fixup_binary_operands_no_copy (EQ, mode, operands);")
+
+(define_expand "_eq3"
   [(set (match_operand: 0 "register_operand")
(unspec:
- [(match_operand:VI48_512 1 "register_operand")
-  (match_operand:VI48_512 2 "nonimmediate_operand")]
+ [(match_operand:VI48_AVX512VL 1 "register_operand")
+  (match_operand:VI48_AVX512VL 2 "nonimmediate_operand")]
  UNSPEC_MASKED_EQ))]
   "TARGET_AVX512F"
   "ix86_fixup_binary_operands_no_copy (EQ, mode, operands);")
 
-(define_insn "avx512f_eq3_1"
+(define_insn "_eq3_1"
   [(set (match_operand: 0 "register_operand" "=Yk")
(unspec:
- [(match_operand:VI48_512 1 "register_operand" "%v")
-  (match_operand:VI48_512 2 "nonimmediate_operand" "vm")]
+ [(match_operand:VI12_AVX512VL 1 "register_operand" "%v")
+  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")]
+ UNSPEC_MASKED_EQ))]
+  "TARGET_AVX512F && ix86_binary_operator_ok (EQ, mode, operands)"
+  "vpcmpeq\t{%2, %1, 
%0|%0, %1, %2}"
+  [(set_attr "type" "ssecmp")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "_eq3_1"
+  [(set (match_operand: 0 "register_operand" "=Yk")
+   (unspec:
+ [(match_operand:VI48_AVX512VL 1 "register_operand" "%v")
+  (match_operand:VI48_AVX512VL 2 "nonimmediate_operand" "vm")]
  UNSPEC_MASKED_EQ))]
   "TARGET_AVX512F && ix86_binary_operator_ok (EQ, mode, operands)"
   "vpcmpeq\t{%2, %

[PATCH][match-and-simplify] Enable conversions properly for GENERIC

2014-09-26 Thread Richard Biener

This fixes a ??? and handles NOP_EXPR and CONVERT_EXPR matching
similar to GIMPLE by using CASE_CONVERT.  This uncovers a
mismatch between fold-const.c and tree-ssa-forwprop.c transforms
which try to do opposite things with ((T) X) & CST vs.
(T) (X & CST).  I have disabled the forwprop transform on GENERIC.
It also uncovers that the C++ FE uses a mix of NOP_EXPR and
CONVERT_EXPR both when building expressions and when checking
for them.  Jason - is there any difference between NOP_EXPR and
CONVERT_EXPR as far as the C++ FE is concerned?  I have
silenced -Wsign-compare warnings that the patch caused by
making enum_cast_to_int "accept" both NOP_EXPR and CONVERT_EXPR
as conversion code (ok for trunk?).

Bootstrap and testing on x86_64-unknown-linux-gnu in progress.

Richard.
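
To illustrate the idea for readers less familiar with tree.h: CASE_CONVERT is a macro that expands to case labels for both NOP_EXPR and CONVERT_EXPR, and CONVERT_EXPR_P accepts either code. The sketch below reproduces that shape with a toy enum in a `model` namespace; the `MODEL_CASE_CONVERT` macro and `classify` function are illustrative only.

```cpp
#include <cassert>
#include <string>

namespace model
{
// Toy subset of tree codes; not GCC's real enumeration.
enum tree_code { NOP_EXPR, CONVERT_EXPR, PLUS_EXPR, BIT_AND_EXPR };

// Same shape as GCC's CASE_CONVERT: one label set covering both codes.
#define MODEL_CASE_CONVERT case model::NOP_EXPR: case model::CONVERT_EXPR

// Same shape as CONVERT_EXPR_CODE_P: true for either conversion code.
inline bool
convert_expr_code_p (tree_code code)
{
  return code == NOP_EXPR || code == CONVERT_EXPR;
}

// A generated matcher would switch on the code like this.
inline std::string
classify (tree_code code)
{
  switch (code)
    {
    MODEL_CASE_CONVERT:
      return "conversion";
    case PLUS_EXPR:
      return "plus";
    default:
      return "other";
    }
}

// Small usage check: both conversion codes take the same branch.
inline void
demo ()
{
  assert (classify (NOP_EXPR) == classify (CONVERT_EXPR));
  assert (convert_expr_code_p (CONVERT_EXPR));
}
} // namespace model
```

Emitting the shared label from genmatch.c means a pattern written with `convert` matches whichever of the two codes a front end happened to build.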

2014-09-26  Richard Biener  

* genmatch.c (dt_node::gen_kids): Handle conversions in
generic expressions properly.
* match-bitwise.pd ((type) X & CST -> (type) (X & ((type-x) CST))):
Disable on GENERIC as it conflicts with a transform in fold-const.c.

cp/
* typeck.c (enum_cast_to_int): Use CONVERT_EXPR_P to check
for conversions.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215638)
+++ gcc/genmatch.c  (working copy)
@@ -1719,8 +1719,10 @@ dt_node::gen_kids (FILE *f, bool gimple)
 {
   expr *e = as_a (generic_exprs[i]->op);
   id_base *op = e->operation;
-  /* ??? CONVERT */
-  fprintf (f, "case %s:\n", op->id);
+  if (*op == CONVERT_EXPR || *op == NOP_EXPR)
+   fprintf (f, "CASE_CONVERT:\n");
+  else
+   fprintf (f, "case %s:\n", op->id);
   fprintf (f, "{\n");
   generic_exprs[i]->gen (f, gimple);
   fprintf (f, "break;\n"
Index: gcc/match-bitwise.pd
===
--- gcc/match-bitwise.pd(revision 215554)
+++ gcc/match-bitwise.pd(working copy)
@@ -28,7 +28,13 @@ along with GCC; see the file COPYING3.
   (bitop (convert @0) (convert? @1))
   (if (((TREE_CODE (@1) == INTEGER_CST
 && INTEGRAL_TYPE_P (TREE_TYPE (@0))
-&& int_fits_type_p (@1, TREE_TYPE (@0)))
+&& int_fits_type_p (@1, TREE_TYPE (@0))
+/* ???  This transform conflicts with fold-const.c doing
+   Convert (T)(x & c) into (T)x & (T)c, if c is an integer
+   constants (if x has signed type, the sign bit cannot be set
+   in c).  This folds extension into the BIT_AND_EXPR.
+   Restrict it to GIMPLE to avoid endless recursions.  */
+&& (bitop != BIT_AND_EXPR || GIMPLE))
|| types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
&& (/* That's a good idea if the conversion widens the operand, thus
  after hoisting the conversion the operation will be narrower.  */
Index: gcc/cp/typeck.c
===
--- gcc/cp/typeck.c (revision 215554)
+++ gcc/cp/typeck.c (working copy)
@@ -3858,7 +3858,7 @@ build_x_array_ref (location_t loc, tree
 static bool
 enum_cast_to_int (tree op)
 {
-  if (TREE_CODE (op) == NOP_EXPR
+  if (CONVERT_EXPR_P (op)
   && TREE_TYPE (op) == integer_type_node
   && TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == ENUMERAL_TYPE
   && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op, 0


[PATCH i386 AVX512] [60/n] Update 128bit ashrv insn pattern.

2014-09-26 Thread Kirill Yukhin
Hello,
This tiny patch extends the 128-bit ashrv expander.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?
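
A note on why the XOP path in the expander negates the shift count: as I understand the XOP "shift arithmetic" instructions, a positive per-element count shifts left and a negative count shifts right arithmetically. The scalar sketch below models that for one 64-bit element; `sha64`, `asr64` and `vashr_elem` are hypothetical helper names, and counts are assumed to be in the range 0..63.

```cpp
#include <cassert>
#include <cstdint>

// Portable arithmetic right shift for n in 0..63.
std::int64_t
asr64 (std::int64_t v, unsigned n)
{
  if (v >= 0)
    return v >> n;
  // For negative v, shift the complement and complement back.
  return ~((~v) >> n);
}

// Model of a shift by a signed count: positive counts shift left,
// negative counts shift right arithmetically.
std::int64_t
sha64 (std::int64_t v, int count)
{
  if (count >= 0)
    // Conversion back to signed is modular on all common targets.
    return static_cast<std::int64_t> (static_cast<std::uint64_t> (v) << count);
  return asr64 (v, static_cast<unsigned> (-count));
}

// An arithmetic right shift by n expressed through the signed-count shift,
// mirroring the "negate, then shift" sequence in the expander.
std::int64_t
vashr_elem (std::int64_t v, unsigned n)
{
  return sha64 (v, -static_cast<int> (n));
}

// Small usage check.
void
demo_vashr ()
{
  assert (vashr_elem (-8, 1) == -4);
  assert (vashr_elem (16, 2) == 4);
}
```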

gcc/
* config/i386/sse.md
(define_mode_iterator VI128_128 [V16QI V8HI V2DI]): Delete.
(define_expand "vashr3"): Add masking,
use VI12_128 mode iterator.
(define_expand "ashrv2di3"): New.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 625a2e0..91d6778 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -498,7 +498,6 @@
 (define_mode_iterator VI12_128 [V16QI V8HI])
 (define_mode_iterator VI14_128 [V16QI V4SI])
 (define_mode_iterator VI124_128 [V16QI V8HI V4SI])
-(define_mode_iterator VI128_128 [V16QI V8HI V2DI])
 (define_mode_iterator VI24_128 [V8HI V4SI])
 (define_mode_iterator VI248_128 [V8HI V4SI V2DI])
 (define_mode_iterator VI48_128 [V4SI V2DI])
@@ -15720,17 +15719,36 @@
  (match_operand:VI48_256 2 "nonimmediate_operand")))]
   "TARGET_AVX2")
 
-(define_expand "vashr3"
-  [(set (match_operand:VI128_128 0 "register_operand")
-   (ashiftrt:VI128_128
- (match_operand:VI128_128 1 "register_operand")
- (match_operand:VI128_128 2 "nonimmediate_operand")))]
-  "TARGET_XOP"
+(define_expand "vashr3"
+  [(set (match_operand:VI12_128 0 "register_operand")
+   (ashiftrt:VI12_128
+ (match_operand:VI12_128 1 "register_operand")
+ (match_operand:VI12_128 2 "nonimmediate_operand")))]
+  "TARGET_XOP || (TARGET_AVX512BW && TARGET_AVX512VL)"
 {
-  rtx neg = gen_reg_rtx (mode);
-  emit_insn (gen_neg2 (neg, operands[2]));
-  emit_insn (gen_xop_sha3 (operands[0], operands[1], neg));
-  DONE;
+  if (TARGET_XOP)
+{
+  rtx neg = gen_reg_rtx (mode);
+  emit_insn (gen_neg2 (neg, operands[2]));
+  emit_insn (gen_xop_sha3 (operands[0], operands[1], neg));
+  DONE;
+}
+})
+
+(define_expand "vashrv2di3"
+  [(set (match_operand:V2DI 0 "register_operand")
+   (ashiftrt:V2DI
+ (match_operand:V2DI 1 "register_operand")
+ (match_operand:V2DI 2 "nonimmediate_operand")))]
+  "TARGET_XOP || TARGET_AVX512VL"
+{
+  if (!TARGET_XOP)
+{
+  rtx neg = gen_reg_rtx (V2DImode);
+  emit_insn (gen_negv2di2 (neg, operands[2]));
+  emit_insn (gen_xop_shav2di3 (operands[0], operands[1], neg));
+  DONE;
+}
 })
 
 (define_expand "vashrv4si3"


Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Bernd Schmidt

On 09/16/2014 02:59 PM, Richard Biener wrote:

On Tue, Sep 16, 2014 at 1:24 PM, Bernd Schmidt  wrote:

Ok, so testing seems to show that nothing breaks with the ARRAY_TYPE special
case removed. However, I remembered another reason to do this, and it's for
consistency with how address spaces are represented in other parts of the
compiler - specifically, the C frontend.

C has the notion that arrays don't have type qualifiers, so to get the
address space of an array you'd have to look at the address space of its
element types. Joseph has in the past rejected patches to fix this
inconsistency. For other types like structs or vectors (as we saw in the
tree-vect patch) it's the outermost type that has the address space
information.

I guess I'll declare myself agnostic, let me know whatever variant you want
to have here (fixing up all types or not fixing arrays) and I'll make a new
patch.


Hmm.  How is it with other composite types like vectors and complex?
It's bad that the middle-end needs to follow a specific frontend's needs.
Why's the representation tied so closely together?

OTOH that address-spaces are "qualifiers" is an implementation detail
(and maybe not the very best).  So I don't see how the C frontend
needs to view them as qualifiers?


So what's the conclusion here? What should I be doing with the patch?


Bernd



Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Richard Biener
On Fri, Sep 26, 2014 at 2:00 PM, Bernd Schmidt  wrote:
> On 09/16/2014 02:59 PM, Richard Biener wrote:
>>
>> On Tue, Sep 16, 2014 at 1:24 PM, Bernd Schmidt 
>> wrote:
>>>
>>> Ok, so testing seems to show that nothing breaks with the ARRAY_TYPE
>>> special
>>> case removed. However, I remembered another reason to do this, and it's
>>> for
>>> consistency with how address spaces are represented in other parts of the
>>> compiler - specifically, the C frontend.
>>>
>>> C has the notion that arrays don't have type qualifiers, so to get the
>>> address space of an array you'd have to look at the address space of its
>>> element types. Joseph has in the past rejected patches to fix this
>>> inconsistency. For other types like structs or vectors (as we saw in the
>>> tree-vect patch) it's the outermost type that has the address space
>>> information.
>>>
>>> I guess I'll declare myself agnostic, let me know whatever variant you
>>> want
>>> to have here (fixing up all types or not fixing arrays) and I'll make a
>>> new
>>> patch.
>>
>>
>> Hmm.  How is it with other composite types like vectors and complex?
>> It's bad that the middle-end needs to follow a specific frontend's needs.
>> Why's the representation tied so closely together?
>>
>> OTOH that address-spaces are "qualifiers" is an implementation detail
>> (and maybe not the very best).  So I don't see how the C frontend
>> needs to view them as qualifiers?
>
>
> So what's the conclusion here? What should I be doing with the patch?

If currently address-space support matches up with the C frontend
and the C standard then the middle-end has to cope with that.
In this case, cope with array element types not having address-space
qualifiers.

Richard.

>
> Bernd
>


Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Bernd Schmidt

On 09/26/2014 02:05 PM, Richard Biener wrote:

If currently address-space support matches up with the C frontend
and the C standard then the middle-end has to cope with that.
In this case, cope with array element types not having address-space
qualifiers.


That's the opposite of what happens. The C frontend makes array element 
types have address-space qualifiers but not the array type.



Bernd
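
The representation described above (the qualifier lives on the element type, not on the array type) can be sketched with a toy type model. All names here are hypothetical and this is not GCC's tree representation; it only shows that a consumer has to strip array layers before reading the address space.

```cpp
#include <cassert>

enum kind_model { SCALAR_KIND, ARRAY_KIND, STRUCT_KIND };

// Hypothetical type node: arrays point at their element type.
struct type_model
{
  kind_model kind;
  unsigned addr_space;        // qualifier recorded on this node itself
  const type_model *element;  // element type for arrays, else nullptr
};

// Strip array layers, then read the address space from what remains.
unsigned
effective_addr_space (const type_model *t)
{
  while (t->kind == ARRAY_KIND)
    t = t->element;
  return t->addr_space;
}

// Small usage check: the array nodes carry 0, the element carries 1.
void
demo_addr_space ()
{
  const type_model elt = { SCALAR_KIND, 1, nullptr };
  const type_model arr = { ARRAY_KIND, 0, &elt };
  const type_model arr2 = { ARRAY_KIND, 0, &arr };
  assert (arr2.addr_space == 0);
  assert (effective_addr_space (&arr2) == 1);
}
```

For structs and vectors the loop does nothing, which matches the statement earlier in the thread that the outermost type carries the information for those.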




[PATCH] Fix PR preprocessor/58893 access to uninitialized memory

2014-09-26 Thread Bernd Edlinger
Hi,

this patch fixes PR58893, which is an access to uninitialized memory that may
or may not crash in linemap_resolve_location, or just print error messages with
a bogus location.

When the first -include file is processed we have the case, where
pfile->cur_token == pfile->cur_run->base, this is directly called
by the front end. However in the case of the second -include file,
this is called from  _cpp_lex_token -> _cpp_get_fresh_line ->
cpp_push_include, with pfile->cur_token != pfile->cur_run->base,
and pfile->cur_token[-1].src_loc and token not (yet) initialized.
The problem is, when the include file cannot be found, we need
src_loc to be initialized to some safe value: 0 means UNKNOWN_LOCATION.

Regarding the hunk in cpp_diagnostic, which is not directly involved
in this bug but is still obviously wrong:

The line "src_loc = pfile->cur_run->prev->limit->src_loc"
is probably unreachable, but will crash if it is ever executed.

see:

_cpp_init_tokenrun (tokenrun *run, unsigned int count)
{
  run->base = XNEWVEC (cpp_token, count);
  run->limit = run->base + count;
  run->next = NULL;
}

so, limit points at the end of the run.
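
To make the one-past-the-end point concrete, here is a small standalone sketch. `token_model` and `tokenrun_model` are hypothetical stand-ins, not libcpp's types, and `last_token_loc` is only one possible safe way to read the final token; the actual patch may do something different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct token_model { unsigned src_loc; };

// Hypothetical run of tokens with the same base/limit convention:
// limit == base + count, i.e. one past the last valid token.
struct tokenrun_model
{
  std::vector<token_model> storage;
  explicit tokenrun_model (std::size_t count) : storage (count) {}
  token_model *base () { return storage.data (); }
  token_model *limit () { return storage.data () + storage.size (); }
};

// Read the location of the last token without touching *limit.
// Returns 0 (standing in for UNKNOWN_LOCATION) for an empty run.
unsigned
last_token_loc (tokenrun_model &run)
{
  if (run.base () == run.limit ())
    return 0;
  return run.limit ()[-1].src_loc;
}

// Small usage check.
void
demo_tokenrun ()
{
  tokenrun_model run (3);
  run.base ()[0].src_loc = 10;
  run.base ()[1].src_loc = 20;
  run.base ()[2].src_loc = 30;
  assert (run.limit () - run.base () == 3);
  assert (last_token_loc (run) == 30);
}
```

Dereferencing `limit` itself, as the quoted line does through `limit->src_loc`, reads past the allocation; `limit[-1]` is the last element that actually exists.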


Boot-Strapped and Regression-tested on x86_64-linux-gnu
Ok for trunk?


Thanks
Bernd.
  

Re: [PATCH] Fix PR preprocessor/58893 access to uninitialized memory

2014-09-26 Thread Marek Polacek
On Fri, Sep 26, 2014 at 02:16:05PM +0200, Bernd Edlinger wrote:
> Boot-Strapped and Regression-tested on x86_64-linux-gnu
> Ok for trunk?

-ENOPATCH.

Marek


Re: [PATCH 3/5] IPA ICF pass

2014-09-26 Thread Martin Liška

On 07/17/2014 05:05 PM, Martin Liška wrote:


On 07/06/2014 12:53 AM, Jan Hubicka wrote:

On Fri, 20 Jun 2014, Trevor Saunders wrote:

+@item -fipa-icf
+@opindex fipa-icf
+Perform Identical Code Folding for functions and read-only variables.

I would perhaps explicitly say that the optimizations reduce code size
and may disturb unwind stacks by replacing a function by an equivalent
one with a different name.

+Behavior is similar to Gold Linker ICF optimization. Symbols proved

Perhaps tell a bit more here: the optimization works more effectively with link
time optimization enabled, and the Gold and GCC ICF work on different
levels and thus are not equivalent optimizations - there are equivalences that
are found only by GCC and equivalences found only by Gold.


+as semantically equivalent are redirected to corresponding symbol. The pass

+sensitively decides for usage of alias, thunk or local redirection.
+This flag is enabled by default at @option{-O2}.

Probably at -Os too.

I found this a bit hard to read/understand.

Perhaps first describe what it does and then, before "This flag is
enabled..." note that "This is similar to the ICF optimization performed
by the Gold linker".
"Symbols proved" (plural) vs "to corresponding symbol" seems to miss
an "a" as in "a corresponding symbol".  Alas, how is that one
determined?  Is this more "...are merged into one", from the user's
perspective?

What does it mean to "sensitively decide for usage of alias, thunk,
or local redirection"?

I think this is just a technical detail of the implementation.  I would not put
that into the user manual.  It means that for some functions you can make an
alias, for others you need a thunk (so addresses stay different).
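
The alias-versus-thunk distinction can be shown in plain C++ without any compiler support. This is only an analogy under stated assumptions: `square_alias` and `square_thunk` are made-up names, and the real pass works on symbols in the call graph rather than on function pointers.

```cpp
#include <cassert>

int
square (int x)
{
  return x * x;
}

using fn = int (*) (int);

// "Alias": a second name that resolves to the very same address.
constexpr fn square_alias = square;

// "Thunk": a distinct function that only forwards to the kept body,
// so code comparing function addresses still sees two functions.
int
square_thunk (int x)
{
  return square (x);
}

// Small usage check.
void
demo_alias_vs_thunk ()
{
  const fn kept = square;
  const fn thunk = square_thunk;
  assert (square_alias == kept);   // same address
  assert (thunk != kept);          // addresses stay different
  assert (thunk (7) == kept (7));  // same behaviour either way
}
```

An alias is cheaper but only valid when nothing can observe that the two addresses have become equal; the thunk costs a forwarding call and preserves address identity.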

Gerald


Hello,
here is an updated version of the patch that now uses the devirtualization
machinery to identify polymorphic types that can potentially break ICF (there
are such examples in Firefox).

Apart from that, I made many small updates, incorporated Trevor's comments, and
tried to improve the documentation entry for the pass.
The patch has been tested on Firefox and Inkscape with LTO.

Thanks,
Martin


Hello.

I spent a couple of weeks fixing new issues connected to the pass:
1) The inliner failed when I created a thunk and released the body of a
function. In such a situation we need to preserve DECL_ARGUMENTS. I added a new
argument to cgraph_node::release_body.
2) An awkward error was hidden in a libstdc++ test for trees: there were two
functions having one argument that differs in one sub-template. Thanks to
Richard, who helped me to fix alias set accuracy.
3) A comparison for FIELD_DECLs (DECL_FIELD_BIT_OFFSET) was missing, which
caused a miscompilation.
4) After discussion with Honza, we introduced a new cgraph_node flag called
icf_merged. The flag helps to fix the verifier in cgraph_node::verify.

The current version of the patch bootstraps on x86_64-linux. With the following
patch applied, there is no testcase regression.
I tried to build Firefox, Inkscape, GIMP and Chromium with LTO and the patch
applied, and no regression has been observed.

Moreover, I discussed this with Richard, and the pass is capable of playing a
role in tree-ssa-tail-merge (according to first experiments). It can replace
the current usage of value numbering.

I hope we can apply the patch to mainline in the near future?

Thank you,
Martin

From 53d20d0b0c209b50d385ee8d85d5a7ed4594d477 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Fri, 26 Sep 2014 13:51:47 +0200
Subject: [PATCH 1/3] IPA ICF: patch1

---
 gcc/Makefile.in  |2 +
 gcc/cgraph.c |   20 +-
 gcc/cgraph.h |2 +
 gcc/cgraphunit.c |2 +-
 gcc/common.opt   |   12 +
 gcc/doc/invoke.texi  |   16 +-
 gcc/ipa-icf-gimple.c |  384 +++
 gcc/ipa-icf.c| 2841 ++
 gcc/ipa-icf.h|  803 ++
 gcc/lto-cgraph.c |2 +
 gcc/lto-section-in.c |3 +-
 gcc/lto-streamer.h   |1 +
 gcc/opts.c   |6 +
 gcc/passes.def   |1 +
 gcc/timevar.def  |1 +
 gcc/tree-pass.h  |1 +
 16 files changed, 4089 insertions(+), 8 deletions(-)
 create mode 100644 gcc/ipa-icf-gimple.c
 create mode 100644 gcc/ipa-icf.c
 create mode 100644 gcc/ipa-icf.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3dd9d8f..8d02425 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1265,6 +1265,8 @@ OBJS = \
 	ipa-profile.o \
 	ipa-prop.o \
 	ipa-pure-const.o \
+	ipa-icf.o \
+	ipa-icf-gimple.o \
 	ipa-reference.o \
 	ipa-ref.o \
 	ipa-utils.o \
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index fdcaf79..439db49 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1913,6 +1913,8 @@ cgraph_node::dump (FILE *f)
 fprintf (f, " only_called_at_exit");
   if (tm_clone)
 fprintf (f, " tm_clone");
+  if (icf_merged)
+fprintf (f, " icf_merged");
   if (DECL_STATIC_CONSTRUCTOR (decl))
 fprintf (f," static_constructor (priority:%i)", get_init_priority ());
   if (DECL_STATIC_DESTRUCTO

FW: [PATCH] Fix PR preprocessor/58893 access to uninitialized memory

2014-09-26 Thread Bernd Edlinger
Ahem, sorry,

again, with patch files.


>
> Hi,
>
> this patch fixes PR58893, which is an access to uninitialized memory that may
> or may not crash in linemap_resolve_location, or just print error messages
> with a bogus location.
>
> When the first -include file is processed we have the case, where
> pfile->cur_token == pfile->cur_run->base, this is directly called
> by the front end. However in the case of the second -include file,
> this is called from _cpp_lex_token -> _cpp_get_fresh_line ->
> cpp_push_include, with pfile->cur_token != pfile->cur_run->base,
> and pfile->cur_token[-1].src_loc and token not (yet) initialized.
> The problem is, when the include file cannot be found, we need
> src_loc to be initialized to some safe value: 0 means UNKNOWN_LOCATION.
>
> Regarding the hunk in cpp_diagnostic, which is not directly involved
> in this bug, but it is still obviously wrong:
>
> The line "src_loc = pfile->cur_run->prev->limit->src_loc"
> is probably unreachable, but will crash if it is ever executed.
>
> see:
>
> _cpp_init_tokenrun (tokenrun *run, unsigned int count)
> {
> run->base = XNEWVEC (cpp_token, count);
> run->limit = run->base + count;
> run->next = NULL;
> }
>
> so, limit points at the end of the run.
>
>
> Boot-Strapped and Regression-tested on x86_64-linux-gnu
> Ok for trunk?
>
>
> Thanks
> Bernd.
>
  2014-09-26  Bernd Edlinger  

PR preprocessor/58893
* errors.c (cpp_diagnostic): Fix possible out of bounds access.
* files.c (_cpp_stack_include): Initialize src_loc for IT_CMDLINE.



patch-pr58893.diff
Description: Binary data


Re: [PATCH 4/5] Existing tests fix

2014-09-26 Thread Martin Liška

On 06/30/2014 02:11 PM, Martin Liška wrote:


On 06/17/2014 09:52 PM, Jeff Law wrote:

On 06/13/14 04:48, mliska wrote:

Hi,
   many tests rely on a precise number of scanned functions in a dump file. If 
IPA ICF decides to merge some functions and/or read-only variables, the counts no 
longer match.
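
As a rough illustration of why such counts can change, consider the sketch below. The function names are hypothetical and are not taken from any of the listed tests. The two functions have identical bodies, so an identical-code-folding pass may keep only one body and turn the other into an alias or thunk; a test that scans a dump for an exact number of occurrences of a pattern would then see fewer matches than before.

```c
#include <assert.h>

/* Two functions with identical bodies: candidates for merging by an
   identical-code-folding pass.  */
int
scale_a (int x)
{
  return x * 3 + 1;
}

int
scale_b (int x)
{
  return x * 3 + 1;
}

/* Merging must not change behaviour: both entry points still have to
   return the same value for every input.  */
void
check_equivalent (int x)
{
  assert (scale_a (x) == scale_b (x));
}
```

A test that expects, say, two matches of a multiply pattern in a dump (one per function) would need its expected count adjusted, or the functions made distinct, once merging happens.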

Martin

Changelog:

2014-06-13  Martin Liska  
Honza Hubicka  

* c-c++-common/rotate-1.c: Text
* c-c++-common/rotate-2.c: New test.
* c-c++-common/rotate-3.c: Likewise.
* c-c++-common/rotate-4.c: Likewise.
* g++.dg/cpp0x/rv-return.C: Likewise.
* g++.dg/cpp0x/rv1n.C: Likewise.
* g++.dg/cpp0x/rv1p.C: Likewise.
* g++.dg/cpp0x/rv2n.C: Likewise.
* g++.dg/cpp0x/rv3n.C: Likewise.
* g++.dg/cpp0x/rv4n.C: Likewise.
* g++.dg/cpp0x/rv5n.C: Likewise.
* g++.dg/cpp0x/rv6n.C: Likewise.
* g++.dg/cpp0x/rv7n.C: Likewise.
* gcc.dg/ipa/ipacost-1.c: Likewise.
* gcc.dg/ipa/ipacost-2.c: Likewise.
* gcc.dg/ipa/ipcp-agg-6.c: Likewise.
* gcc.dg/ipa/remref-2a.c: Likewise.
* gcc.dg/ipa/remref-2b.c: Likewise.
* gcc.dg/pr46309-2.c: Likewise.
* gcc.dg/torture/ipa-pta-1.c: Likewise.
* gcc.dg/tree-ssa/andor-3.c: Likewise.
* gcc.dg/tree-ssa/andor-4.c: Likewise.
* gcc.dg/tree-ssa/andor-5.c: Likewise.
* gcc.dg/vect/no-vfa-pr29145.c: Likewise.
* gcc.dg/vect/vect-cond-10.c: Likewise.
* gcc.dg/vect/vect-cond-9.c: Likewise.
* gcc.dg/vect/vect-widen-mult-const-s16.c: Likewise.
* gcc.dg/vect/vect-widen-mult-const-u16.c: Likewise.
* gcc.dg/vect/vect-widen-mult-half-u8.c: Likewise.
* gcc.target/i386/bmi-1.c: Likewise.
* gcc.target/i386/bmi-2.c: Likewise.
* gcc.target/i386/pr56564-2.c: Likewise.
* g++.dg/opt/pr30965.C: Likewise.
* g++.dg/tree-ssa/pr19637.C: Likewise.
* gcc.dg/guality/csttest.c: Likewise.
* gcc.dg/ipa/iinline-4.c: Likewise.
* gcc.dg/ipa/iinline-7.c: Likewise.
* gcc.dg/ipa/ipa-pta-13.c: Likewise.

I know this is the least interesting part of your changes, but it's also simple 
and mechanical and thus trivial to review. Approved, but obviously don't 
install until the rest of your patch has been approved.

Similar changes for recently added tests or cases where you might improve ICF 
requiring similar tweaks to existing tests are pre-approved as well.

jeff


Hello,
I fixed a few more tests and added a correct ChangeLog message.

gcc/testsuite/ChangeLog

2014-06-30  Martin Liska  
 Honza Hubicka  

 * c-c++-common/rotate-1.c: Test fixed.
 * c-c++-common/rotate-2.c: Likewise.
 * c-c++-common/rotate-3.c: Likewise.
 * c-c++-common/rotate-4.c: Likewise.
 * g++.dg/cpp0x/rv-return.C: Likewise.
 * g++.dg/cpp0x/rv1n.C: Likewise.
 * g++.dg/cpp0x/rv1p.C: Likewise.
 * g++.dg/cpp0x/rv2n.C: Likewise.
 * g++.dg/cpp0x/rv3n.C: Likewise.
 * g++.dg/cpp0x/rv4n.C: Likewise.
 * g++.dg/cpp0x/rv5n.C: Likewise.
 * g++.dg/cpp0x/rv6n.C: Likewise.
 * g++.dg/cpp0x/rv7n.C: Likewise.
 * g++.dg/ipa/devirt-g-1.C: Likewise.
 * g++.dg/ipa/inline-1.C: Likewise.
 * g++.dg/ipa/inline-2.C: Likewise.
 * g++.dg/ipa/inline-3.C: Likewise.
 * g++.dg/opt/pr30965.C: Likewise.
 * g++.dg/tree-ssa/pr19637.C: Likewise.
 * gcc.dg/guality/csttest.c: Likewise.
 * gcc.dg/ipa/iinline-4.c: Likewise.
 * gcc.dg/ipa/iinline-7.c: Likewise.
 * gcc.dg/ipa/ipa-pta-13.c: Likewise.
 * gcc.dg/ipa/ipacost-1.c: Likewise.
 * gcc.dg/ipa/ipacost-2.c: Likewise.
 * gcc.dg/ipa/ipcp-agg-6.c: Likewise.
 * gcc.dg/ipa/remref-2a.c: Likewise.
 * gcc.dg/ipa/remref-2b.c: Likewise.
 * gcc.dg/pr46309-2.c: Likewise.
 * gcc.dg/torture/ipa-pta-1.c: Likewise.
 * gcc.dg/tree-ssa/andor-3.c: Likewise.
 * gcc.dg/tree-ssa/andor-4.c: Likewise.
 * gcc.dg/tree-ssa/andor-5.c: Likewise.
 * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
 * gcc.dg/vect/vect-cond-10.c: Likewise.
 * gcc.dg/vect/vect-cond-9.c: Likewise.
 * gcc.dg/vect/vect-widen-mult-const-s16.c: Likewise.
 * gcc.dg/vect/vect-widen-mult-const-u16.c: Likewise.
 * gcc.dg/vect/vect-widen-mult-half-u8.c: Likewise.
 * gcc.target/i386/bmi-1.c: Likewise.
 * gcc.target/i386/bmi-2.c: Likewise.
 * gcc.target/i386/pr56564-2.c: Likewise.

Thank you,
Martin



Hello.

Here is an updated version of the patch that fixes another issue connected to 
the test suite.

Thanks,
Martin
From e7818e646687c05e13a68828ef70fb41716a267c Mon Sep 17 00:00:00 2001
From: mliska 
Date: Fri, 26 Sep 2014 13:52:29 +0200
Subject: [PATCH 2/3] IPA ICF: patch2.

---
 gcc/testsuite/c-c++-common/rotate-1.c | 2 +-
 gcc/testsuite/c-c++-common/rotate-2.c | 2 +-
 gcc/testsuite/c-c++-common/rotate-3.c | 2 +-
 gcc/testsuite/c-c++-common/rotate-4.c | 2 +-
 gcc/testsuite/g++.dg/cpp0x/rv-return.C| 1 +
 gcc/testsuite/g++.dg/cpp0x/rv1n.C | 2 ++
 gcc/testsuite/g++.dg/cpp0x/rv1p.C | 1 +
 gcc/tests

Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Richard Biener
On Fri, Sep 26, 2014 at 2:14 PM, Bernd Schmidt  wrote:
> On 09/26/2014 02:05 PM, Richard Biener wrote:
>>
>> If currently address-space support matches up with the C frontend
>> and the C standard then the middle-end has to cope with that.
>> In this case, cope with array element types not having address-space
>> qualifiers.
>
>
> That's the opposite of what happens. The C frontend makes array element
> types have address-space qualifiers but not the array type.

Ah, ok.  Then the opposite way around ;)

Richard.

>
> Bernd
>
>


Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Bernd Schmidt

On 09/26/2014 02:26 PM, Richard Biener wrote:
> On Fri, Sep 26, 2014 at 2:14 PM, Bernd Schmidt  wrote:
>> On 09/26/2014 02:05 PM, Richard Biener wrote:
>>>
>>> If currently address-space support matches up with the C frontend
>>> and the C standard then the middle-end has to cope with that.
>>> In this case, cope with array element types not having address-space
>>> qualifiers.
>>
>> That's the opposite of what happens. The C frontend makes array element
>> types have address-space qualifiers but not the array type.
>
> Ah, ok.  Then the opposite way around ;)

Ok, so that means that my original patch which updated the element types 
for arrays is in fact the way to go?

Bernd




[PATCH i386 AVX512] [61/n] Update FP logic insn patterns.

2014-09-26 Thread Kirill Yukhin
Hello,
This patch extends andnot and any_logic insn
patterns.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_insn "_andnot3"): Add masking,
use VF_128_256 mode iterator and update assembler emit code.
(define_insn "_andnot3"): New.
(define_expand "3"):
Add masking, use VF_128_256 mode iterator.
(define_expand "3"): New.
(define_insn "*3"):
Add masking, use VF_128_256 mode iterator and update assembler emit
code.
(define_insn "*3"): New.
(define_mode_attr avx512flogicsuff): Delete.
(define_insn "avx512f_"): Ditto.
(define_insn "*andnot3"): Update MODE_XI, MODE_OI,
MODE_TI.
(define_insn "3"): Ditto.

--
Thanks, K
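
The mnemonic selection described in the ChangeLog (use vpandn[qd] when AVX512DQ is unavailable) can be sketched outside of the machine description roughly as follows. This is only an illustration: `toy_andn_mnemonic` and its boolean parameters are hypothetical, the operand template is omitted, and the "ps"/"pd" suffixes for the AVX512DQ case are an assumption, since the mode-iterator text in the patch below did not survive in this archive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build the 512-bit and-not mnemonic into BUF.  Without AVX512DQ there
   is no vandnp[sd] for 512-bit vectors, so fall back to the integer
   form vpandn[qd], picking the element width from the FP mode.  */
static const char *
toy_andn_mnemonic (char *buf, size_t size, bool have_avx512dq,
                   bool is_double)
{
  const char *ops = "";
  /* Assumed FP suffixes; the patch takes them from a mode attribute.  */
  const char *suffix = is_double ? "pd" : "ps";

  assert (size > 0);
  if (!have_avx512dq)
    {
      suffix = is_double ? "q" : "d";
      ops = "p";
    }

  snprintf (buf, size, "v%sandn%s", ops, suffix);
  return buf;
}
```

For a double-precision vector this yields "vpandnq" without AVX512DQ and "vandnpd" with it; the single-precision counterparts are "vpandnd" and "vandnps".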

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 91d6778..9835234 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -2687,15 +2687,15 @@
 ;;
 ;
 
-(define_insn "_andnot3"
-  [(set (match_operand:VF 0 "register_operand" "=x,v")
-   (and:VF
- (not:VF
-   (match_operand:VF 1 "register_operand" "0,v"))
- (match_operand:VF 2 "nonimmediate_operand" "xm,vm")))]
-  "TARGET_SSE"
+(define_insn "_andnot3"
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x,v")
+   (and:VF_128_256
+ (not:VF_128_256
+   (match_operand:VF_128_256 1 "register_operand" "0,v"))
+ (match_operand:VF_128_256 2 "nonimmediate_operand" "xm,vm")))]
+  "TARGET_SSE && "
 {
-  static char buf[32];
+  static char buf[128];
   const char *ops;
   const char *suffix;
 
@@ -2715,17 +2715,17 @@
   ops = "andn%s\t{%%2, %%0|%%0, %%2}";
   break;
 case 1:
-  ops = "vandn%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
+  ops = "vandn%s\t{%%2, %%1, %%0|%%0, 
%%1, %%2}";
   break;
 default:
   gcc_unreachable ();
 }
 
-  /* There is no vandnp[sd].  Use vpandnq.  */
-  if ( == 64)
+  /* There is no vandnp[sd] in avx512f.  Use vpandn[qd].  */
+  if ( && !TARGET_AVX512DQ)
 {
-  suffix = "q";
-  ops = "vpandn%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
+  suffix = GET_MODE_INNER (mode) == DFmode ? "q" : "d";
+  ops = "vpandn%s\t{%%2, %%1, %%0|%%0, 
%%1, %%2}";
 }
 
   snprintf (buf, sizeof (buf), ops, suffix);
@@ -2745,30 +2745,63 @@
   ]
   (const_string "")))])
 
-(define_expand "3"
+
+(define_insn "_andnot3"
+  [(set (match_operand:VF_512 0 "register_operand" "=v")
+   (and:VF_512
+ (not:VF_512
+   (match_operand:VF_512 1 "register_operand" "v"))
+ (match_operand:VF_512 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512F"
+{
+  static char buf[128];
+  const char *ops;
+  const char *suffix;
+
+  suffix = "";
+  ops = "";
+
+  /* There is no vandnp[sd] in avx512f.  Use vpandn[qd].  */
+  if (!TARGET_AVX512DQ)
+{
+  suffix = GET_MODE_INNER (mode) == DFmode ? "q" : "d";
+  ops = "p";
+}
+
+  snprintf (buf, sizeof (buf),
+   "v%sandn%s\t{%%2, %%1, %%0|%%0, 
%%1, %%2}",
+   ops, suffix);
+  return buf;
+}
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_expand "3"
   [(set (match_operand:VF_128_256 0 "register_operand")
-   (any_logic:VF_128_256
- (match_operand:VF_128_256 1 "nonimmediate_operand")
- (match_operand:VF_128_256 2 "nonimmediate_operand")))]
-  "TARGET_SSE"
+   (any_logic:VF_128_256
+ (match_operand:VF_128_256 1 "nonimmediate_operand")
+ (match_operand:VF_128_256 2 "nonimmediate_operand")))]
+  "TARGET_SSE && "
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
-(define_expand "3"
+(define_expand "3"
   [(set (match_operand:VF_512 0 "register_operand")
-   (fpint_logic:VF_512
+   (any_logic:VF_512
  (match_operand:VF_512 1 "nonimmediate_operand")
  (match_operand:VF_512 2 "nonimmediate_operand")))]
   "TARGET_AVX512F"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
-(define_insn "*3"
-  [(set (match_operand:VF 0 "register_operand" "=x,v")
-   (any_logic:VF
- (match_operand:VF 1 "nonimmediate_operand" "%0,v")
- (match_operand:VF 2 "nonimmediate_operand" "xm,vm")))]
-  "TARGET_SSE && ix86_binary_operator_ok (, mode, operands)"
+(define_insn "*3"
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x,v")
+   (any_logic:VF_128_256
+ (match_operand:VF_128_256 1 "nonimmediate_operand" "%0,v")
+ (match_operand:VF_128_256 2 "nonimmediate_operand" "xm,vm")))]
+  "TARGET_SSE && 
+   && ix86_binary_operator_ok (, mode, operands)"
 {
-  static char buf[32];
+  static char buf[128];
   const char *ops;
   const char *suffix;
 
@@ -2788,17 +2821,17 @@
   ops = "%s\t{%%2, %%0|%%0, %%2}";
   break;
 case 1:
-  ops = "v%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
+  ops =

Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure

2014-09-26 Thread Ilya Verbin
Hi,

The patch has been updated:

On 17 Sep 15:45, Jakub Jelinek wrote:
> Looks mostly ok, just some nits.  But see the patch I've just posted,
> perhaps we want to tweak the --enable-offload-targets arguments.  And

Now the targets in --enable-offload-targets can have optional path to the build
or install tree of the corresponding offload compiler.

> > --- /dev/null
> > +++ b/libgcc/ompstuff.c
> > @@ -0,0 +1,79 @@
> > +/* FIXME: Including auto-host is incorrect, but until we have
> > +   identified the set of defines that need to go into auto-target.h,
> > +   this will have to do.  */
> > +#include "auto-host.h"
> > +#undef pid_t
> > +#undef rlim_t
> > +#undef ssize_t
> > +#undef vfork
> 
> crtstuff.c undefs here also caddr_t, any reason not to do that too?

caddr_t was added to the crtstuff.c after ompstuff.c was created.  Fixed.

> > +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> > +void *_omp_func_table[0]
> > +  __attribute__ ((__used__, visibility ("hidden"),
> > + section (OFFLOAD_FUNC_TABLE_SECTION_NAME))) = { };
> > +void *_omp_var_table[0]
> > +  __attribute__ ((__used__, visibility ("hidden"),
> > + section (OFFLOAD_VAR_TABLE_SECTION_NAME))) = { };
> > +#endif
> 
> Does this mean that if HAVE_GAS_HIDDEN is not defined, you don't
> define _omp_*_table at all and offloading will fail?
> I wonder if it just should avoid visibility ("hidden") if it isn't
> supported.

Without visibility ("hidden"), offloading works only if there is just an
executable.  If a DSO also registers its _omp_func_table in libgomp,
offloading will not work, since _omp_func_table from the executable overrides
the respective symbol in the DSO.  So, if there are an executable and a DSO with
offloading but without visibility ("hidden"), I'd prefer to perform host
fallback, as is done now, rather than crash at run time.

Also, the previous patch contained a bug: if a compiler is configured as an
accelerator, it installs *-accel-*-g++ and other drivers, but only
*-accel-*-gcc is needed.  Therefore I suppressed their installation in the
corresponding Makefiles.

The define OFFLOAD_LIBRARY is now removed from libgomp, since it is no longer
needed.

And libexecsubdir in lto-plugin/Makefile.in is tweaked for the possibility of
being configured as an accelerator (as was done in gcc/Makefile.in).
Otherwise the offload compiler is unable to find its plugin.

Bootstrapped and regtested on i686-linux and x86_64-linux.
OK for trunk (after everything has been reviewed)?


2014-09-26  Bernd Schmidt  
Thomas Schwinge  
Ilya Verbin  
Andrey Turetskiy  

* configure: Regenerate.
* configure.ac (--enable-as-accelerator-for)
(--enable-offload-targets): New configure options.
gcc/
* Makefile.in (real_target_noncanonical, accel_dir_suffix)
(enable_as_accelerator): New variables substituted by configure.
(libsubdir, libexecsubdir, unlibsubdir): Tweak for the possibility of
being configured as an offload compiler.
(DRIVER_DEFINES): Pass new defines DEFAULT_REAL_TARGET_MACHINE and
ACCEL_DIR_SUFFIX.
(install-cpp, install-common, install_driver, install-gcc-ar): Do not
install for the offload compiler.
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac (real_target_noncanonical, accel_dir_suffix)
(enable_as_accelerator, enable_offload_targets): Compute new variables.
(--enable-as-accelerator-for, --enable-offload-targets): New options.
(ACCEL_COMPILER): Define if the compiler is built as the accel compiler.
(OFFLOAD_TARGETS): List of target names suitable for offloading.
(ENABLE_OFFLOADING): Define if list of offload targets is not empty.
gcc/cp/
* Make-lang.in (c++.install-common): Do not install for the offload
compiler.
gcc/fortran/
* Make-lang.in (fortran.install-common): Do not install for the offload
compiler.
libgcc/
* Makefile.in (crtompbegin$(objext), crtompend$(objext)): New rule.
* configure: Regenerate.
* configure.ac (--enable-as-accelerator-for)
(--enable-offload-targets): New configure options.
(extra_parts): Add crtompbegin.o and crtompend.o if
enable_offload_targets is not empty.
* ompstuff.c: New file.
libgomp/
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check for libdl, required for plugin support.
(PLUGIN_SUPPORT): Define if plugins are supported.
(--enable-offload-targets): New configure option.
(enable_offload_targets): Support Intel MIC targets.
(OFFLOAD_TARGETS): List of target names suitable for offloading.
lto-plugin/
* Makefile.am (libexecsubdir): Tweak for the possibility of being
configured for offload compiler.
(accel_dir_suffix): New variable substituted by configure.
* Makefile.in: Regenerate.
* config

Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Richard Biener
On Fri, Sep 26, 2014 at 2:28 PM, Bernd Schmidt  wrote:
> On 09/26/2014 02:26 PM, Richard Biener wrote:
>>
>> On Fri, Sep 26, 2014 at 2:14 PM, Bernd Schmidt 
>> wrote:
>>>
>>> On 09/26/2014 02:05 PM, Richard Biener wrote:


>>>> If currently address-space support matches up with the C frontend
>>>> and the C standard then the middle-end has to cope with that.
>>>> In this case, cope with array element types not having address-space
>>>> qualifiers.
>>>
>>>
>>>
>>> That's the opposite of what happens. The C frontend makes array element
>>> types have address-space qualifiers but not the array type.
>>
>>
>> Ah, ok.  Then the opposite way around ;)
>
>
> Ok, so that means that my original patch which updated the element types for
> arrays is in fact the way to go?

It seems to do both, apply the as to the array _and_ the element type, no?

Thus for arrays you'd need to do (in that old patch's terms)

  type = build_variant_type_copy (type);
  TREE_TYPE (type) = apply_as_to_type (TREE_TYPE (type), as);

and drop the build_qualified_type call for the array type itself.  Oh,
and instead of unconditionally doing that copy, walk the existing
variant list to see if there is already a properly qualified variant.

(it seems to me the apply_as_to_type function should first check
if 'type' already has the appropriate address-space qualification).

Richard.

>
> Bernd
>
>
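
The suggestion above (qualify only the element type of an array, and reuse an existing variant instead of always copying) can be modelled with a small toy. Everything here is hypothetical: `toy_type` and the `toy_*` functions are not GCC's tree API, and the linked list only imitates the idea of a main variant with a chain of variants. It is a sketch of the control flow, not of the real patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model of a type node with a variant chain.  */
struct toy_type
{
  int addr_space;                 /* 0 means the generic address space.  */
  struct toy_type *elem;          /* Element type for arrays, else NULL.  */
  struct toy_type *main_variant;  /* Head of the variant chain.  */
  struct toy_type *next_variant;  /* Next type in the chain.  */
};

static struct toy_type *
toy_make_type (int as, struct toy_type *elem)
{
  struct toy_type *t = calloc (1, sizeof *t);
  assert (t != NULL);
  t->addr_space = as;
  t->elem = elem;
  t->main_variant = t;
  return t;
}

/* Link V into the variant chain that TYPE belongs to.  */
static void
toy_link_variant (struct toy_type *type, struct toy_type *v)
{
  struct toy_type *mv = type->main_variant;
  v->main_variant = mv;
  v->next_variant = mv->next_variant;
  mv->next_variant = v;
}

/* Return a variant of TYPE in address space AS.  For arrays only the
   element type is qualified; the array type itself keeps its own
   address space.  Existing variants are reused where possible.  */
static struct toy_type *
toy_apply_as (struct toy_type *type, int as)
{
  struct toy_type *v;

  if (type->elem == NULL)
    {
      /* First check whether TYPE is already suitably qualified.  */
      if (type->addr_space == as)
        return type;
      for (v = type->main_variant; v != NULL; v = v->next_variant)
        if (v->elem == NULL && v->addr_space == as)
          return v;
      v = toy_make_type (as, NULL);
      toy_link_variant (type, v);
      return v;
    }

  struct toy_type *elem = toy_apply_as (type->elem, as);
  if (elem == type->elem)
    return type;
  /* Walk the variant list before making a copy.  */
  for (v = type->main_variant; v != NULL; v = v->next_variant)
    if (v->elem == elem && v->addr_space == type->addr_space)
      return v;
  v = toy_make_type (type->addr_space, elem);
  toy_link_variant (type, v);
  return v;
}
```

Applying the same address space to the same array type twice returns the same node in this model, which is the property the variant-list walk is meant to provide.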


Re: [GOMP4, RFC] OpenMP4 offload support for Intel PHI targets.

2014-09-26 Thread Ilya Verbin
Hi,

I also rebased and updated our branch:
https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/kyukhin/gomp4-offload

It contains fixes for the issues, mentioned in "Offloading not relocatable".

https://gcc.gnu.org/wiki/Offloading was updated accordingly.

  -- Ilya


Re: [PATCH 2/4] [AARCH64,NEON] Convert arm_neon.h to use new builtins for vld[234](q?)_lane_*

2014-09-26 Thread Tejas Belagod

On 26/09/14 02:16, Charles Baylis wrote:

On 19 September 2014 12:21, Tejas Belagod  wrote:

The reason we avoided using type-punning using unions was that reload would
get confused with potential subreg(mem) that could be introduced because of
memory xfer caused by unions and large int modes. As a result, we would get
incorrect or sub-optimal code. But this seems to have fixed itself. :-)

Because this involves xfers between large int modes and
CANNOT_CHANGE_MODE_CLASS has some impact on it, it would be good to test
what impact your patch has with C_C_M_C removed, so that it will be easier
to fix the fallout once we remove C_C_M_C eventually. To test this you will
need Richard's patch set
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01440.html.

Same for your other 2 patches in this series(3,4).


I tried those patches, and altered aarch64_cannot_change_mode_class to
return false for all cases.

However, this does not avoid the unnecessary moves.

Taking a really simple test case:

#include 

int32x2x2_t xvld2_s32(int32_t *__a)
{
   int32x2x2_t ret;
   __builtin_aarch64_simd_oi __o;
   __o = __builtin_aarch64_ld2v2si ((const __builtin_aarch64_simd_si *) __a);
   ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0);
   ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1);
   return ret;
}

(disabling scheduling for clarity)
$ aarch64-oe-linux-gcc -O2 -S -o - simd.c -fno-schedule-insns
-fno-schedule-insns2
 ...
xvld2_s32:
 ld2 {v2.2s - v3.2s}, [x0]
 orr v0.8b, v2.8b, v2.8b
 orr v1.8b, v3.8b, v3.8b
 ret
 ...


The reason is apparent in the rtl dump from ira:
...
   Allocno a0r73 of FP_REGS(32) has 31 avail. regs  33-63, node:
33-63 (confl regs =  0-32 64 65)
...
(insn 2 4 3 2 (set (reg/v/f:DI 79 [ __a ])
 (reg:DI 0 x0 [ __a ])) simd.c:5 34 {*movdi_aarch64}
  (expr_list:REG_DEAD (reg:DI 0 x0 [ __a ])
 (nil)))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 20 2 (set (reg/v:OI 73 [ __o ])
 (subreg:OI (vec_concat:V8SI (vec_concat:V4SI (unspec:V2SI [
 (mem:TI (reg/v/f:DI 79 [ __a ]) [0  S16 A8])
 ] UNSPEC_LD2)
 (vec_duplicate:V2SI (const_int 0 [0])))
 (vec_concat:V4SI (unspec:V2SI [
 (mem:TI (reg/v/f:DI 79 [ __a ]) [0  S16 A8])
 ] UNSPEC_LD2)
 (vec_duplicate:V2SI (const_int 0 [0] 0))
simd.c:8 2149 {aarch64_ld2v2si_dreg}
  (expr_list:REG_DEAD (reg/v/f:DI 79 [ __a ])
 (nil)))
(insn 20 6 21 2 (set (reg:V2SI 32 v0)
 (subreg:V2SI (reg/v:OI 73 [ __o ]) 0)) simd.c:12 778
{*aarch64_simd_movv2si}
  (nil))
(insn 21 20 22 2 (set (reg:V2SI 33 v1)
 (subreg:V2SI (reg/v:OI 73 [ __o ]) 16)) simd.c:12 778
{*aarch64_simd_movv2si}
  (expr_list:REG_DEAD (reg/v:OI 73 [ __o ])
 (nil)))
(insn 22 21 23 2 (use (reg:V2SI 32 v0)) simd.c:12 -1
  (nil))
(insn 23 22 0 2 (use (reg:V2SI 33 v1)) simd.c:12 -1
  (nil))

The register allocator considers r73 to conflict with v0, because they
are simultaneously live after insn 20. Without the 2nd use of r73 (e.g.
if the write to ret.val[1] is replaced with vdup_n_s32(0)) then the
allocator does do the right thing with the subreg and allocates r73 to
{v0,v1}.

I haven't read all of the old threads relating to Richard's patches
yet, but I don't see why they would affect this issue.

I don't think the register allocator is able to resolve this unless
the conversion between the __builtin_simd type and the int32x4x2_t
type is done as a single operation.



For this piece of code,

#include "arm_neon.h"

int32x2x2_t xvld2_s32(int32_t *__a)
{
  union { int32x2x2_t __i;
 __builtin_aarch64_simd_oi __o; } __temp;
  __temp.__o = __builtin_aarch64_ld2v2si ((const 
__builtin_aarch64_simd_si *) __a);

  return __temp.__i;
}

int32x2x2_t yvld2_s32(int32_t *__a)
{
  int32x2x2_t ret;
  __builtin_aarch64_simd_oi __o;
  __o = __builtin_aarch64_ld2v2si ((const __builtin_aarch64_simd_si *) 
__a);

  ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0);
  ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1);
  return ret;
}

currently my gcc HEAD generates at -O3:

xvld2_s32:
ld2 {v0.2s - v1.2s}, [x0]
sub sp, sp, #64
st1 {v0.16b - v1.16b}, [sp]
ldr x1, [sp]
ldr x0, [sp, 8]
add sp, sp, 64
ins v0.d[0], x1
ins v1.d[0], x0
ret

yvld2_s32:
ld2 {v2.2s - v3.2s}, [x0]
orr v1.8b, v3.8b, v3.8b
orr v0.8b, v2.8b, v2.8b
ret

If we use type-punning, unnecessary spills are generated, 
which is also incorrect for BE because of the way we spill (st1 
{v0.16b - v1.16b}, [sp]) and restore. The implementation without 
type-punning seems to give a more optimal result. Did your patches 
improve on the s

[PATCH] Avoid an unused stack frame for -mprofile-kernel profiling on leaf functions.

2014-09-26 Thread Anton Blanchard

gcc/:

2014-09-25  Anton Blanchard  

PR target/63354
* config/rs6000/rs6000.c (rs6000_keep_leaf_when_profiled): New function.
* config/rs6000/linux64.h (TARGET_KEEP_LEAF_WHEN_PROFILED): Define.
---
 gcc/config/rs6000/linux64.h | 3 +++
 gcc/config/rs6000/rs6000.c  | 9 +
 2 files changed, 12 insertions(+)

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 39a2b17..415345b 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -59,6 +59,9 @@ extern int dot_symbols;
 
 #define TARGET_PROFILE_KERNEL profile_kernel
 
+#undef TARGET_KEEP_LEAF_WHEN_PROFILED
+#define TARGET_KEEP_LEAF_WHEN_PROFILED rs6000_keep_leaf_when_profiled
+
 #define TARGET_USES_LINUX64_OPT 1
 #ifdef HAVE_LD_LARGE_TOC
 #undef TARGET_CMODEL
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 16847aa..4e33e7b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -23997,6 +23997,15 @@ rs6000_output_function_prologue (FILE *file,
   rs6000_pic_labelno++;
 }
 
+/* -mprofile-kernel code calls mcount before the function prolog,
+   so a profiled leaf function should stay a leaf function.  */
+
+static bool
+rs6000_keep_leaf_when_profiled ()
+{
+  return TARGET_PROFILE_KERNEL;
+}
+
 /* Non-zero if vmx regs are restored before the frame pop, zero if
we restore after the pop when possible.  */
 #define ALWAYS_RESTORE_ALTIVEC_BEFORE_POP 0
-- 
1.9.1



Re: Avoid privatization of TLS variables

2014-09-26 Thread Alan Modra
On Fri, Sep 26, 2014 at 04:17:14AM +0200, Jan Hubicka wrote:
> I was building libreoffice with profile feedback and I run into a message
> 
> cannot load any more object with static TLS
> 
> that took me a while to track down, as I did not see where static TLS was coming 
> from.
> Ian pointed out to me that static variables with TLS storage also consume
> static TLS even if they are in dynamic model.  This is why I disabled
> localization.  Is there better way to handle this?

Fix a glibc bug?  It has been a while since I looked into glibc in
any depth regarding TLS (2011-03), but I believe the l_tls_modid test
here
  if (! RTLD_SINGLE_THREAD_P && imap->l_tls_modid > DTV_SURPLUS)
_dl_signal_error (0, "dlopen", NULL, N_("\
cannot load any more object with static TLS"));

is wrong.  The test is saying "if we have loaded a certain number of
dynamic objects with TLS segments, refuse to dlopen any more
containing TLS if we are multi-threaded".

What it should be saying is "if we have loaded a certain number of
dynamic objects with TLS segments *after we went multi-threaded*,
refuse to open any more".  In particular, any dynamic objects with TLS
segments loaded at program startup should not be counted.  This is
because DTV_SURPLUS *extra* slots are allocated above those needed at
program startup.  At least, that's how I think it works.
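
To make the difference between the two readings concrete, here is a small sketch. It is not glibc code: the function names, the `startup_modules` parameter, and the way the counts are passed in are all assumptions made for illustration, and it reflects only the interpretation given above, which is itself stated with some doubt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The check as quoted: refuse once the module id exceeds the surplus,
   regardless of how many TLS modules were present at startup.  */
static bool
toy_refuse_current (bool single_thread, size_t modid, size_t surplus)
{
  return !single_thread && modid > surplus;
}

/* The suggested reading: only modules loaded beyond those present at
   program startup count against the surplus slots.  */
static bool
toy_refuse_proposed (bool single_thread, size_t modid,
                     size_t startup_modules, size_t surplus)
{
  assert (modid >= startup_modules);
  return !single_thread && modid - startup_modules > surplus;
}
```

With an illustrative surplus of 14 and 10 TLS modules at startup, module id 16 is refused by the first check but accepted by the second, which would only start refusing at module id 25.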

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Put all MAINTAINERS email addresses into <...>

2014-09-26 Thread Segher Boessenkool
On Thu, Sep 25, 2014 at 11:10:00PM +0200, Jan-Benedict Glaw wrote:
> Resending this email. Seems some spam filter ate it due to the many
> email addresses...

Will now all/most/many further patches to MAINTAINERS hit spam filters
as well?


Segher


Re: [Patch, AArch64] Enable Address sanitizer and UB sanitizer

2014-09-26 Thread Christophe Lyon
On 21 September 2014 20:07, Christophe Lyon  wrote:
> On 17 September 2014 12:48, Marcus Shawcroft  
> wrote:
>> On 9 September 2014 13:08, Christophe Lyon  
>> wrote:
>>> On 9 September 2014 12:03,   wrote:


> On Sep 9, 2014, at 2:50 AM, Marcus Shawcroft  
> wrote:
>
> +static unsigned HOST_WIDE_INT
> +aarch64_asan_shadow_offset (void)
> +{
> +  return (HOST_WIDE_INT_1 << 36);
> +}
> +
>
> Looking around various other ports I see magic numbers including 29,
> 41, 44 Help me understand why 36 is the right choice for aarch64?

 Also why 36?  What is the min virtual address space aarch64 Linux kernel 
 supports with 4k pages and 3 level page table?  Also does this need to be 
 conditionalized on lp64?  Since I am about to post glibc patches turning 
 on address sanitizer breaks that.

>>>
>>> The address space is 2^39 according to /proc/self/maps:
>>> [...]
>>>
>>> The shadow offset is obtained by dividing this value by 8 -> 2^36.
>>>
>>> Note that this value has to match kAArch64_ShadowOffset64 as defined
>>> in libsanitizer/asan/asan_mapping.h.
>>>
>>> I do expect a followup patch to support ilp32, but I wouldn't post a
>>> patch which I haven't tested.
>>
>> Presumably for ILP32 the shadow offset should be 1<<29 and we will
>> need to make both asan_mapping.h and aarch64_asan_shadow_offset
>> conditional.
>>
> Indeed. We'll do that once Andrew has committed all his ILP32 patches (glibc).
>
>> This patch for LP64 is OK.
> I will commit it once the libsanitizer runtime has been updated to at
> least r209641 otherwise GCC will fail to build for AArch64.

Committed as r215642.

Christophe.
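
The arithmetic behind the value 36 discussed in this thread can be written out as a short sketch. It assumes the usual AddressSanitizer mapping, shadow = (address >> 3) + offset, and the 39-bit address space mentioned above; the `TOY_*` macros and `toy_*` functions are hypothetical names for illustration and are not part of GCC or libsanitizer.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VA_BITS 39       /* Address space size quoted in the thread.  */
#define TOY_SHADOW_SCALE 3   /* One shadow byte covers 8 bytes.  */

/* 2^39 / 8 = 2^36, matching HOST_WIDE_INT_1 << 36 in the patch.  */
static uint64_t
toy_shadow_offset (void)
{
  return UINT64_C (1) << (TOY_VA_BITS - TOY_SHADOW_SCALE);
}

/* Map an application address to its shadow byte address.  */
static uint64_t
toy_mem_to_shadow (uint64_t addr)
{
  assert (addr < (UINT64_C (1) << TOY_VA_BITS));
  return (addr >> TOY_SHADOW_SCALE) + toy_shadow_offset ();
}
```

Under these assumptions, addresses 8 through 15 all map to the same shadow byte, one above the shadow byte for addresses 0 through 7.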


Re: [PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-26 Thread Rainer Orth
Hi Maxim,

> Thank you all for your help!
>
> Done in r215633.

Unfortunately, the applied patch cannot have been tested properly and
breaks native i386-pc-solaris2.11 (and every other) bootstrap:

/vol/gcc/src/hg/trunk/local/gcc/gcc.c: In function 'attempt_status 
run_attempt(const char**, const char*, const char*, int, int)':
/vol/gcc/src/hg/trunk/local/gcc/gcc.c:6319:15: error: variable 'errmsg' set but 
not used [-Werror=unused-but-set-variable]
   const char *errmsg;
   ^
/vol/gcc/src/hg/trunk/local/gcc/gcc.c: At global scope:
/vol/gcc/src/hg/trunk/local/gcc/gcc.c:6412:33: error: unused parameter 'prog' 
[-Werror=unused-parameter]
 try_generate_repro (const char *prog, const char **argv)
 ^
Removing the errmsg variable and the prog parameter name fixes this.
I'm not sure if the errmsg is intentionally unused, though.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Put all MAINTAINERS email addresses into <...>

2014-09-26 Thread Steven Bosscher
On Fri, Sep 26, 2014 at 3:09 PM, Segher Boessenkool wrote:
> On Thu, Sep 25, 2014 at 11:10:00PM +0200, Jan-Benedict Glaw wrote:
>> Resending this email. Seems some spam filter ate it due to the many
>> email addresses...
>
> Will now all/most/many further patches to MAINTAINERS hit spam filters
> as well?


Let's hope not. But at least for me, all mail to people @ arm.com now bounces...

Ciao!
Steven


Re: [PATCH] Put all MAINTAINERS email addresses into <...>

2014-09-26 Thread Trevor Saunders
On Fri, Sep 26, 2014 at 03:14:16PM +0200, Steven Bosscher wrote:
> On Fri, Sep 26, 2014 at 3:09 PM, Segher Boessenkool wrote:
> > On Thu, Sep 25, 2014 at 11:10:00PM +0200, Jan-Benedict Glaw wrote:
> >> Resending this email. Seems some spam filter ate it due to the many
> >> email addresses...
> >
> > Will now all/most/many further patches to MAINTAINERS hit spam filters
> > as well?
> 
> 
> Let's hope not. But at least for me, all mail to people @ arm.com now 
> bounce...

Isn't that caused by the disclaimer thingy at the bottom of their mails?
Stripping that out fixed the issue for me.

Trev
> 
> Ciao!
> Steven


Re: [PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-26 Thread Thomas Schwinge
Hi!

On Fri, 26 Sep 2014 12:04:45 +0400, Maxim Ostapenko 
 wrote:
> Thank you all for your help!
> 
> Done in r215633.
> 
> -Maxim
> On 09/25/2014 11:05 PM, Jeff Law wrote:
> > On 09/23/14 01:14, Maxim Ostapenko wrote:
> >>
> >>
> >> 2014-09-04  Jakub Jelinek
> >> Max Ostapenko
> >>
> >> * common.opt: New option.
> >> * doc/invoke.texi: Describe new option.
> >> * gcc.c (execute): Don't free first string early, but at the end
> >> of the function.  Call retry_ice if compiler exited with
> >> ICE_EXIT_CODE.
> >> (main): Factor out common code.
> >> (print_configuration): New function.
> >> (files_equal_p): Likewise.
> >> (check_repro): Likewise.
> >> (run_attempt): Likewise.
> >> (do_report_bug): Likewise.
> >> (append_text): Likewise.
> >> (try_generate_repro): Likewise
> > Approved.  Please install.
> >
> > Thanks for your patience,
> > Jeff

This is causing compiler warnings, respectively bootstrap errors:

[...]
../../master/gcc/gcc.c: In function 'attempt_status run_attempt(const 
char**, const char*, const char*, int, int)':
../../master/gcc/gcc.c:6319:15: error: variable 'errmsg' set but not used 
[-Werror=unused-but-set-variable]
   const char *errmsg;
   ^
../../master/gcc/gcc.c: At global scope:
../../master/gcc/gcc.c:6412:33: error: unused parameter 'prog' 
[-Werror=unused-parameter]
 try_generate_repro (const char *prog, const char **argv)
 ^
cc1plus: all warnings being treated as errors
Makefile:1040: recipe for target 'gcc.o' failed
make[3]: *** [gcc.o] Error 1
make[3]: Leaving directory 
'/media/erich/home/thomas/tmp/gcc/hurd/master.build/gcc'
Makefile:4285: recipe for target 'all-stage2-gcc' failed
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory 
'/media/erich/home/thomas/tmp/gcc/hurd/master.build'
Makefile:21561: recipe for target 'stage2-bubble' failed
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory 
'/media/erich/home/thomas/tmp/gcc/hurd/master.build'
Makefile:892: recipe for target 'all' failed
make: *** [all] Error 2

OK to fix as follows?  Only compile-tested, did not test the new
-freport-bug functionality.

diff --git gcc/gcc.c gcc/gcc.c
index e32ff47..47c4e28 100644
--- gcc/gcc.c
+++ gcc/gcc.c
@@ -253,7 +253,7 @@ static void init_gcc_specs (struct obstack *, const char *, 
const char *,
 static const char *convert_filename (const char *, int, int);
 #endif
 
-static void try_generate_repro (const char *prog, const char **argv);
+static void try_generate_repro (const char **argv);
 static const char *getenv_spec_function (int, const char **);
 static const char *if_exists_spec_function (int, const char **);
 static const char *if_exists_else_spec_function (int, const char **);
@@ -2918,7 +2918,7 @@ execute (void)
&& i == 0
&& (p = strrchr (commands[0].argv[0], DIR_SEPARATOR))
&& ! strncmp (p + 1, "cc1", 3))
- try_generate_repro (commands[0].prog, commands[0].argv);
+ try_generate_repro (commands[0].argv);
if (WEXITSTATUS (status) > greatest_status)
  greatest_status = WEXITSTATUS (status);
ret_code = -1;
@@ -6332,6 +6332,16 @@ run_attempt (const char **new_argv, const char *out_temp,
   errmsg = pex_run (pex, pex_flags, new_argv[0],
CONST_CAST2 (char *const *, const char **, &new_argv[1]), 
out_temp,
err_temp, &err);
+  if (errmsg != NULL)
+{
+  if (err == 0)
+   fatal_error (errmsg);
+  else
+   {
+ errno = err;
+ pfatal_with_name (errmsg);
+   }
+}
 
   if (!pex_get_status (pex, 1, &exit_status))
 goto out;
@@ -6409,7 +6419,7 @@ append_text (char *file, const char *str)
and preprocessed source code.  */
 
 static void
-try_generate_repro (const char *prog, const char **argv)
+try_generate_repro (const char **argv)
 {
   int i, nargs, out_arg = -1, quiet = 0, attempt;
   const char **new_argv;



Grüße,
 Thomas
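
The error handling added to run_attempt above follows a common pattern: a
launcher returns a message plus an optional errno value, and the caller
reports the message alone when the errno value is 0 and message plus system
error text otherwise. A minimal sketch of that decision, assuming a
hypothetical launch_result struct and describe_launch_failure helper (neither
is part of gcc.c or libiberty):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for what a pex_run-style call hands back:
   MSG is NULL on success; on failure ERR is 0 or an errno value.  */
struct launch_result
{
  const char *msg;
  int err;
};

/* Write a diagnostic for R into BUF (at most LEN bytes).
   Return 0 if there was no error, 1 otherwise.  */
int
describe_launch_failure (struct launch_result r, char *buf, size_t len)
{
  if (r.msg == NULL)
    {
      if (len > 0)
        buf[0] = '\0';
      return 0;
    }

  if (r.err == 0)
    /* No errno available: report the message on its own.  */
    snprintf (buf, len, "%s", r.msg);
  else
    /* errno available: append the system error text.  */
    snprintf (buf, len, "%s: %s", r.msg, strerror (r.err));
  return 1;
}
```

In the real patch the two branches call fatal_error and pfatal_with_name and
do not return; the sketch only formats the text so that it can be inspected.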




Re: [Patch AArch64 4/4] Wire up New target hooks

2014-09-26 Thread James Greenhalgh
On Thu, Sep 25, 2014 at 03:57:36PM +0100, James Greenhalgh wrote:
> 
> Hi,
> 
> This patch wires up our new target hooks for AArch64. This also means
> we can bring back the two failing SRA tests.
> 
> Bootstrapped on AArch64 with no issues.
> 
> OK for trunk?

No way! This patch is nonsense as it stands!

I'd like to withdraw this for now while I have a think about what
has gone wrong!

Thanks,
James

> 
> Thanks,
> James
> 
> ---
> gcc/
> 
> 2014-09-25  James Greenhalgh  
> 
>   * config/aarch64/aarch64.c
>   (aarch64_estimate_movmem_ninsns): New.
>   (aarch64_expand_movmem): Refactor old move costs.
>   (aarch64_move_by_pieces_profitable_p): New.
>   (aarch64_estimate_block_copy_ninsns): Likewise.
>   (aarch64_max_scalarization_size): Likewise.
>   (TARGET_MAX_SCALARIZATION_SIZE): Likewise.
>   (TARGET_MOVE_BY_PIECES_PROFITABLE_P): Likewise.
>   * config/aarch64/aarch64.h (AARCH64_MOVE_RATIO): New.
>   (MOVE_RATIO): Delete.
> 
> gcc/testsuite/
> 
> 2014-09-25  James Greenhalgh  
> 
>   * gcc.dg/tree-ssa/pr42585.c: Bring back for AArch64.
>   * gcc.dg/tree-ssa/sra-12.c: Likewise.
> 

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 3483081..d8b5a4a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9616,6 +9616,34 @@ aarch64_modes_tieable_p (enum machine_mode mode1, enum 
> machine_mode mode2)
>return false;
>  }
>  
> +static unsigned int
> +aarch64_estimate_movmem_ninsns (HOST_WIDE_INT size)
> +{
> +  HOST_WIDE_INT chunks = 0;
> +  int n = size;
> +
> +  /* 3 bytes is a 2-byte then a 1-byte copy.  */
> +  if (n == 3)
> +return 2;
> +
> +  /* 5, 6, 7 bytes need an extra copy.  */
> +  if (n > 4 && n < 8)
> +chunks++;
> +
> +  /* If n was greater than 8, it is dealt with in 8/16-byte chunks
> + first.  */
> +  chunks += n / 16;
> +  n %= 16;
> +  chunks += n / 8;
> +  n %= 8;
> +
> +  /* Anything left is dealt with in one instruction.  */
> +  if (n != 0)
> +chunks++;
> +
> +  return chunks;
> +}
> +
>  /* Return a new RTX holding the result of moving POINTER forward by
> AMOUNT bytes.  */
>  
> @@ -9673,7 +9701,7 @@ aarch64_expand_movmem (rtx *operands)
>  
>/* When optimizing for size, give a better estimate of the length of a
>   memcpy call, but use the default otherwise.  */
> -  unsigned int max_instructions = (speed_p ? 15 : AARCH64_CALL_RATIO) / 2;
> +  unsigned int max_instructions = AARCH64_MOVE_RATIO (speed_p);
>  
>/* We can't do anything smart if the amount to copy is not constant.  */
>if (!CONST_INT_P (operands[2]))
> @@ -9681,10 +9709,9 @@ aarch64_expand_movmem (rtx *operands)
>  
>n = UINTVAL (operands[2]);
>  
> -  /* Try to keep the number of instructions low.  For cases below 16 bytes we
> - need to make at most two moves.  For cases above 16 bytes it will be one
> - move for each 16 byte chunk, then at most two additional moves.  */
> -  if (((n / 16) + (n % 16 ? 2 : 0)) > max_instructions)
> +  /* Try to keep the number of instructions we emit low, fail expansion
> + if we are unable to and leave it to memcpy.  */
> +  if (aarch64_estimate_movmem_ninsns (n) > max_instructions)
>  return false;
>  
>base = copy_to_mode_reg (Pmode, XEXP (dst, 0));
> @@ -9774,6 +9801,57 @@ aarch64_expand_movmem (rtx *operands)
>return true;
>  }
>  
> +/* Implement TARGET_MOVE_BY_PIECES_PROFITABLE_P.  */
> +
> +bool
> +aarch64_move_by_pieces_profitable_p (unsigned int size
> +  unsigned int align
> +  bool speed_p)
> +{
> +  /* For strict alignment we don't want to use our unaligned
> + movmem implementation.  */
> +  if (STRICT_ALIGNMENT)
> +return (AARCH64_MOVE_RATIO (speed_p)
> + < move_by_pieces_ninsns (size, align, speed_p));
> +
> +  /* If we have an overhang of 3, 6 or 7 bytes, we would emit an unaligned
> + load to cover it, if this is likely to be slow we would do better
> + going through move_by_pieces.  */
> +  if (size % 8 > 5)
> +return SLOW_UNALIGNED_ACCESS (DImode, 1);
> +  else if (size % 4 == 3)
> +return SLOW_UNALIGNED_ACCESS (SImode, 1);
> +
> +  /* We can likely do a better job than the move_by_pieces infrastructure
> + can.  */
> +  return false;
> +}
> +
> +/* Implement TARGET_ESTIMATE_BLOCK_COPY_NINSNS.  */
> +
> +unsigned int
> +aarch64_estimate_block_copy_ninsns (HOST_WIDE_INT size, bool speed_p)
> +{
> +  if (aarch64_move_by_pieces_profitable_p (size, 8, speed_p))
> +return move_by_pieces_ninsns (size, 8, MOVE_MAX_PIECES);
> +  else if (aarch64_estimate_movmem_ninsns (size)
> +< AARCH64_MOVE_RATIO (speed_p))
> +return aarch64_estimate_movmem_ninsns (size);
> +  else
> +/* memcpy.  Set up 3 arguments and make a call.  */
> +return 4;
> +}
> +
> +/* Implement TARGET_MAX_SCALARIZATION_SIZE.  */
> +
> +unsigned int
> +aarch64_max_scalarization_size (bool speed_p)
> +{
> +  /* 
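
For readers who want to see what the quoted instruction estimate produces, here
is a standalone copy of the arithmetic in aarch64_estimate_movmem_ninsns. It
is only a sketch: HOST_WIDE_INT is replaced by long, the function name is
changed, and since the patch was withdrawn the numbers should not be read as
the behaviour of any committed GCC code.

```c
/* Standalone copy of the counting logic from the quoted patch.
   Returns the estimated number of copy instructions for SIZE bytes.  */
unsigned int
estimate_movmem_ninsns (long size)
{
  long chunks = 0;
  long n = size;

  /* 3 bytes is a 2-byte then a 1-byte copy.  */
  if (n == 3)
    return 2;

  /* 5, 6, 7 bytes need an extra copy.  */
  if (n > 4 && n < 8)
    chunks++;

  /* Larger sizes are first dealt with in 16-byte, then 8-byte chunks.  */
  chunks += n / 16;
  n %= 16;
  chunks += n / 8;
  n %= 8;

  /* Anything left is counted as one instruction.  */
  if (n != 0)
    chunks++;

  return (unsigned int) chunks;
}
```

Note that the "extra copy" adjustment only fires when the whole size is 5 to
7 bytes; a 23-byte copy (16 + 7) is counted as two instructions, the same as
a 24-byte copy.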

Re: [PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-26 Thread Maxim Ostapenko

Ugh, sorry. Thanks for fixing this.

On 09/26/2014 05:23 PM, Thomas Schwinge wrote:

Hi!

On Fri, 26 Sep 2014 12:04:45 +0400, Maxim Ostapenko 
 wrote:

Thank you all for your help!

Done in r215633.

-Maxim
On 09/25/2014 11:05 PM, Jeff Law wrote:

On 09/23/14 01:14, Maxim Ostapenko wrote:


2014-09-04  Jakub Jelinek
 Max Ostapenko

 * common.opt: New option.
 * doc/invoke.texi: Describe new option.
 * gcc.c (execute): Don't free first string early, but at the end
 of the function.  Call retry_ice if compiler exited with
 ICE_EXIT_CODE.
 (main): Factor out common code.
 (print_configuration): New function.
 (files_equal_p): Likewise.
 (check_repro): Likewise.
 (run_attempt): Likewise.
 (do_report_bug): Likewise.
 (append_text): Likewise.
 (try_generate_repro): Likewise

Approved.  Please install.

Thanks for your patience,
Jeff

This is causing compiler warnings, and hence bootstrap errors:

 [...]
 ../../master/gcc/gcc.c: In function 'attempt_status run_attempt(const 
char**, const char*, const char*, int, int)':
 ../../master/gcc/gcc.c:6319:15: error: variable 'errmsg' set but not used 
[-Werror=unused-but-set-variable]
const char *errmsg;
^
 ../../master/gcc/gcc.c: At global scope:
 ../../master/gcc/gcc.c:6412:33: error: unused parameter 'prog' 
[-Werror=unused-parameter]
  try_generate_repro (const char *prog, const char **argv)
  ^
 cc1plus: all warnings being treated as errors
 Makefile:1040: recipe for target 'gcc.o' failed
 make[3]: *** [gcc.o] Error 1
 make[3]: Leaving directory 
'/media/erich/home/thomas/tmp/gcc/hurd/master.build/gcc'
 Makefile:4285: recipe for target 'all-stage2-gcc' failed
 make[2]: *** [all-stage2-gcc] Error 2
 make[2]: Leaving directory 
'/media/erich/home/thomas/tmp/gcc/hurd/master.build'
 Makefile:21561: recipe for target 'stage2-bubble' failed
 make[1]: *** [stage2-bubble] Error 2
 make[1]: Leaving directory 
'/media/erich/home/thomas/tmp/gcc/hurd/master.build'
 Makefile:892: recipe for target 'all' failed
 make: *** [all] Error 2

OK to fix as follows?  Only compile-tested, did not test the new
-freport-bug functionality.


This works fine on pr55843 and pr58987.


diff --git gcc/gcc.c gcc/gcc.c
index e32ff47..47c4e28 100644
--- gcc/gcc.c
+++ gcc/gcc.c
@@ -253,7 +253,7 @@ static void init_gcc_specs (struct obstack *, const char *, 
const char *,
  static const char *convert_filename (const char *, int, int);
  #endif
  
-static void try_generate_repro (const char *prog, const char **argv);

+static void try_generate_repro (const char **argv);
  static const char *getenv_spec_function (int, const char **);
  static const char *if_exists_spec_function (int, const char **);
  static const char *if_exists_else_spec_function (int, const char **);
@@ -2918,7 +2918,7 @@ execute (void)
&& i == 0
&& (p = strrchr (commands[0].argv[0], DIR_SEPARATOR))
&& ! strncmp (p + 1, "cc1", 3))
- try_generate_repro (commands[0].prog, commands[0].argv);
+ try_generate_repro (commands[0].argv);
if (WEXITSTATUS (status) > greatest_status)
  greatest_status = WEXITSTATUS (status);
ret_code = -1;
@@ -6332,6 +6332,16 @@ run_attempt (const char **new_argv, const char *out_temp,
errmsg = pex_run (pex, pex_flags, new_argv[0],
CONST_CAST2 (char *const *, const char **, &new_argv[1]), 
out_temp,
err_temp, &err);
+  if (errmsg != NULL)
+{
+  if (err == 0)
+   fatal_error (errmsg);
+  else
+   {
+ errno = err;
+ pfatal_with_name (errmsg);
+   }
+}
  
if (!pex_get_status (pex, 1, &exit_status))

  goto out;
@@ -6409,7 +6419,7 @@ append_text (char *file, const char *str)
 and preprocessed source code.  */
  
  static void

-try_generate_repro (const char *prog, const char **argv)
+try_generate_repro (const char **argv)
  {
int i, nargs, out_arg = -1, quiet = 0, attempt;
const char **new_argv;



Grüße,
  Thomas

-Maxim


Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition

2014-09-26 Thread Felix Yang
Hi Jeff,

Thanks for the suggestions. I updated the patch accordingly.

1. Both my employer (Huawei) and I have signed the copyright
assignments with the FSF.
These assignments were sent by post two days ago and
should hopefully reach the FSF within a week.
Maybe it's OK to commit this patch now?

 2. I am not turning member loop_depth of struct equivalence into
a short integer, as GCC APIs such as bb_loop_depth
 return a loop's depth as a 32-bit integer.

 3. I find it rather difficult to use the new type and
interfaces for walking the init_insns list in this patch.
The type of the init_insns list is rtx, not rtl_insn_list *. It seems
we would need to change a lot in order to use the new interface.
It is not clear to me why it was not adjusted during the
transition to the new interface.
Anyway, I think it's better to have another patch fix that issue. OK?

 4. This bug is only reproducible with my local customized GCC
version, so I don't have a testcase.

 5. This patch was bootstrapped on x86_64-suse-linux and reg-tested.
There are no regressions with this patch.
 Regression test summary with or without the patch:

=== gcc Summary ===

# of expected passes            107986
# of unexpected failures        348
# of unexpected successes       33
# of expected failures          262
# of unsupported tests          2089
/home/yangfei/gcc-devel/gcc-build/gcc/xgcc  version 5.0.0 20140924
(experimental) (GCC)
--
=== g++ Summary ===

# of expected passes            87415
# of unexpected failures        276
# of expected failures          266
# of unsupported tests          3203
/home/yangfei/gcc-devel/gcc-build/gcc/testsuite/g++/../../xg++
version 5.0.0 20140924 (experimental) (GCC)

--
=== libatomic Summary ===

# of expected passes            54
=== libgomp tests ===


Running target unix

=== libgomp Summary ===

# of expected passes            693
=== libitm tests ===


Running target unix

=== libitm Summary ===

# of expected passes            26
# of expected failures          3
# of unsupported tests          1
=== libstdc++ tests ===


+++ gcc/ChangeLog(working copy)
@@ -1,3 +1,13 @@
+2014-09-26  Felix Yang  
+Jeff Law  
+
+* ira.c (struct equivalence): Change member "is_arg_equivalence"
and "replace"
+into boolean bitfields; add new member "no_equiv" and "reserved".
+(no_equiv): Set no_equiv of struct equivalence if register is marked
+as having no known equivalence.
+(update_equiv_regs): Check all definitions for a multiple-set
+register to make sure that the RHS have the same value.
+
 2014-09-26  Martin Liska  

 * cgraph.c (cgraph_node::release_body): New argument keep_arguments
Index: gcc/ira.c
===
--- gcc/ira.c(revision 215640)
+++ gcc/ira.c(working copy)
@@ -2896,10 +2896,13 @@ struct equivalence
  to be present within the same loop (or in an inner loop).  */
   int loop_depth;
   /* Nonzero if this had a preexisting REG_EQUIV note.  */
-  int is_arg_equivalence;
+  unsigned char is_arg_equivalence : 1;
   /* Set when an attempt should be made to replace a register
  with the associated src_p entry.  */
-  char replace;
+  unsigned char replace : 1;
+  /* Set if this register has no known equivalence.  */
+  unsigned char no_equiv : 1;
+  unsigned char reserved : 5;
 };

 /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence
@@ -3247,6 +3250,7 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSE
   if (!REG_P (reg))
 return;
   regno = REGNO (reg);
+  reg_equiv[regno].no_equiv = 1;
   list = reg_equiv[regno].init_insns;
   if (list == const0_rtx)
 return;
@@ -3258,7 +3262,7 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSE
 return;
   ira_reg_equiv[regno].defined_p = false;
   ira_reg_equiv[regno].init_insns = NULL;
-  for (; list; list =  XEXP (list, 1))
+  for (; list; list = XEXP (list, 1))
 {
   rtx insn = XEXP (list, 0);
   remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX));
@@ -3373,7 +3377,7 @@ update_equiv_regs (void)

   /* If this insn contains more (or less) than a single SET,
  only mark all destinations as having no known equivalence.  */
-  if (set == 0)
+  if (set == NULL_RTX)
 {
   note_stores (PATTERN (insn), no_equiv, NULL);
   continue;
@@ -3467,16 +3471,48 @@ update_equiv_regs (void)
   if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST)
 note = NULL_RTX;

-  if (DF_REG_DEF_COUNT (regno) != 1
-  && (! note
+  if (DF_REG_DEF_COUNT (regno) != 1)
+{
+  rtx list;
+  bool equal_p = true;
+
+  /* If we have already processed this pseudo and determined it
+ can not have an equivalence, then honor that decision.  */
+  if (reg_equiv[regno].no_equiv)
+continue;
+
+  if (! note
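
The ChangeLog entry for update_equiv_regs describes the new rule: a pseudo
with several definitions may keep an equivalence only if it has not already
been marked no_equiv and every definition sets it to the same value. The
following toy model illustrates that rule only. The pseudo_def struct and the
defs_allow_equiv helper are hypothetical, and comparing plain integers stands
in for the RTL comparison the real code has to perform.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one definition of a pseudo register:
   just the value assigned by that definition.  */
struct pseudo_def
{
  long rhs_value;
};

/* Return true if a multiply-set pseudo may still get an equivalence:
   it must not be flagged NO_EQUIV, and all NDEFS definitions in DEFS
   must assign the same value.  */
bool
defs_allow_equiv (bool no_equiv, const struct pseudo_def *defs, size_t ndefs)
{
  /* Honor an earlier decision that no equivalence is possible.  */
  if (no_equiv || ndefs == 0)
    return false;

  for (size_t i = 1; i < ndefs; i++)
    if (defs[i].rhs_value != defs[0].rhs_value)
      return false;

  return true;
}
```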
 

[PATCH i386 AVX512] [62/n] Add vpmaddubsw,vdbpsadbw insn patterns.

2014-09-26 Thread Kirill Yukhin
Hello,
This patch introduces patterns for vpmaddubsw and vdbpsadbw
insn.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_c_enum "unspec"): Add UNSPEC_DBPSADBW, UNSPEC_PMADDUBSW512.
(define_insn "avx512bw_pmaddubsw512"): New.
(define_insn "avx512bw_dbpsadbw"):
Ditto.
--
Thanks, K
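
As background for reviewers: the patterns below are unspecs, so the patch
itself does not spell out what the instruction computes. The sketch here is a
scalar reference for one output element of vpmaddubsw as I understand the
instruction (unsigned bytes from the first source multiplied by signed bytes
from the second, adjacent products added with signed saturation to 16 bits).
The function name pmaddubsw_pair is made up for illustration and is not part
of GCC or any intrinsics header.

```c
#include <stdint.h>

/* Scalar model of one 16-bit result element of vpmaddubsw:
   A0/A1 are adjacent unsigned bytes of the first source,
   B0/B1 the corresponding signed bytes of the second source.  */
int16_t
pmaddubsw_pair (uint8_t a0, int8_t b0, uint8_t a1, int8_t b1)
{
  int32_t sum = (int32_t) a0 * b0 + (int32_t) a1 * b1;

  /* Signed saturation to the 16-bit range.  */
  if (sum > INT16_MAX)
    sum = INT16_MAX;
  if (sum < INT16_MIN)
    sum = INT16_MIN;
  return (int16_t) sum;
}
```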

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9835234..601373b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -130,6 +130,8 @@
   UNSPEC_SHA256RNDS2
 
   ;; For AVX512BW support
+  UNSPEC_DBPSADBW
+  UNSPEC_PMADDUBSW512
   UNSPEC_PSHUFHW
   UNSPEC_PSHUFLW
   UNSPEC_CVTINT2MASK
@@ -13401,6 +13403,19 @@
(set_attr "prefix" "vex")
(set_attr "mode" "OI")])
 
+;; Unspec version for intrinsics.
+(define_insn "avx512bw_pmaddubsw512"
+  [(set (match_operand:VI2_AVX512VL 0 "register_operand" "=v")
+  (unspec:VI2_AVX512VL
+[(match_operand: 1 "register_operand" "v")
+ (match_operand: 2 "nonimmediate_operand" "vm")]
+ UNSPEC_PMADDUBSW512))]
+   "TARGET_AVX512BW"
+   "vpmaddubsw\t{%2, %1, %0|%0, %1, %2}";
+  [(set_attr "type" "sseiadd")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
 (define_insn "ssse3_pmaddubsw128"
   [(set (match_operand:V8HI 0 "register_operand" "=x,x")
(ss_plus:V8HI
@@ -18097,6 +18112,21 @@
[(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
+(define_insn "avx512bw_dbpsadbw"
+  [(set (match_operand:VI2_AVX512VL 0 "register_operand" "=v")
+   (unspec:VI2_AVX512VL
+ [(match_operand: 1 "register_operand" "v")
+  (match_operand: 2 "nonimmediate_operand" "vm")
+  (match_operand:SI 3 "const_0_to_255_operand")]
+ UNSPEC_DBPSADBW))]
+   "TARGET_AVX512BW"
+  "vdbpsadbw\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "isa" "avx")
+   (set_attr "type" "sselog1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "clz2"
   [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
(clz:VI48_AVX512VL


Re: [PATCH C++] - SD-6 Implementation Part 2 - __has_include macro and C++ language feature macros.

2014-09-26 Thread Ed Smith-Rowland

On 09/25/2014 01:40 PM, Jason Merrill wrote:

On 09/01/2014 09:41 PM, Ed Smith-Rowland wrote:

+  cpp_define (pfile, "__cpp_attribute_deprecated=201309");


Don't we support attribute deprecated in C++11?

Jason



We support [[gnu::deprecated]] in C++11 but not [[deprecated]] until C++14.
Ed



RE: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)"

2014-09-26 Thread David Sherwood
Hi Vladimir,

Sorry this took so long. I have tidied up the patch as you suggested and fixed
some style issues. Hope this looks better now.

Thanks!
David.

2014-09-26  David Sherwood  

* ira-int.h (ira_allocno): Add "wmode" field.
* ira-build.c (create_insn_allocnos): Add new "parent" function
parameter.
* ira-conflicts.c (ira_build_conflicts): Add conflicts for registers
that cannot be accessed in wmode.


-Original Message-
From: David Sherwood [mailto:david.sherw...@arm.com] 
Sent: 08 September 2014 12:48
To: 'gcc-patches@gcc.gnu.org'
Cc: 'vmaka...@redhat.com'
Subject: RE: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o 
compile, (internal
compiler error)"

Hi Vladimir,

Sorry, I forgot to CC you on this as it's your code. It's my first attempt at
submitting patches to gcc so I'm still learning as I go!

Kind Regards,
David Sherwood.

-Original Message-
From: David Sherwood [mailto:david.sherw...@arm.com] 
Sent: 05 September 2014 15:52
To: 'gcc-patches@gcc.gnu.org'
Subject: Fix for "FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o 
compile, (internal
compiler error)"

Hi,

I have a potential fix for a gcc testsuite failure for aarch64 in big endian, i.e.

FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o compile, (internal compiler error)
FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_y_tst.o compile, (internal compiler error)

It is caused by the inappropriate choice of hard registers for paradoxical
subregs in big endian mode. For example, if register 0 is chosen for a
paradoxical TI subreg on big endian, then we may end up attempting to
reference register -1. Similarly, on little endian we could end up going
beyond the upper bound of the register file.
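
To make the out-of-range case concrete, here is a toy bounds check. It
assumes a simplified layout in which a paradoxical subreg of OUTER_NREGS
registers wraps a value occupying INNER_NREGS registers, extending downwards
on big endian and upwards on little endian. The function and its parameters
are hypothetical and are not the check used in the IRA patch.

```c
#include <stdbool.h>

/* Toy model: return true if hard register REGNO can hold a value of
   INNER_NREGS registers that is also accessed through a paradoxical
   subreg spanning OUTER_NREGS registers, without leaving the range
   [0, NUM_HARD_REGS).  */
bool
paradoxical_regno_ok (int regno, int inner_nregs, int outer_nregs,
                      bool big_endian, int num_hard_regs)
{
  int extra = outer_nregs - inner_nregs;

  /* On big endian the wider access starts EXTRA registers earlier.  */
  int first = big_endian ? regno - extra : regno;
  int last = first + outer_nregs - 1;

  return first >= 0 && last < num_hard_regs;
}
```

With a one-register inner value, a two-register outer mode and 32 hard
registers, register 0 is rejected on big endian and register 31 is rejected
on little endian, which mirrors the two failure cases described above.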

My fix involves adding particular constraints in IRA on the choice of register
once paradoxical sub registers are encountered. However, Richard Sandiford also
proposed an alternative solution that involves not constraining registers in
IRA, but rather making use of cost analysis instead and letting LRA do the
work. Not sure what your preference is.
Fix was tested on aarch64 on little and big endian with no regressions.

Regards,
David Sherwood.

2014-08-26  David Sherwood  

* ira-int.h (ira_allocno): Add "wmode" field.
* ira-build.c (create_insn_allocnos): Add new "parent" function
parameter.
* ira-conflicts.c (ira_build_conflicts): Add conflicts for registers
that cannot be accessed in wmode.


rb2369.patch
Description: Binary data


Re: [RFC][ARM]: Fix reload spill failure (PR 60617)

2014-09-26 Thread Christophe Lyon
Ramana,

On 7 July 2014 13:48, Venkataramanan Kumar
 wrote:
> Hi Ramana/Maxim,
>
>
> On 18 June 2014 16:05, Venkataramanan Kumar
>  wrote:
>> Hi Ramana,
>>
>> On 18 June 2014 15:29, Ramana Radhakrishnan  
>> wrote:
>>> On Mon, Jun 16, 2014 at 1:53 PM, Venkataramanan Kumar
>>>  wrote:
 Hi Maintainers,

 This patch fixes the PR 60617 that occurs when we turn on reload pass
 in thumb2 mode.

 It occurs for the pattern "*ior_scc_scc" that gets generated for the 3
 argument of the below function call.

 JIT:emitStoreInt32(dst,regT0m, (op1 == dst || op2 == dst)));


 (snip---)
 (insn 634 633 635 27 (parallel [
 (set (reg:SI 3 r3)
 (ior:SI (eq:SI (reg/v:SI 110 [ dst ]) <== This operand
 r5 is registers gets assigned
 (reg/v:SI 112 [ op2 ]))
 (eq:SI (reg/v:SI 110 [ dst ]) <== This operand
 (reg/v:SI 111 [ op1 ]
 (clobber (reg:CC 100 cc))
 ]) ../Source/JavaScriptCore/jit/JITArithmetic32_64.cpp:179 300
 {*ior_scc_scc
 (snip---)

 The issue here is that the above pattern demands 5 registers (LO_REGS).

 But when we are in reload, registers r0 is used for pointer to the
 class, r1 and r2 for first and second argument. r7 is used for stack
 pointer.

 So we are left with r3,r4,r5 and r6. But the above patterns needs five
 LO_REGS. Hence we get spill failure when processing the last register
 operand in that pattern,

 In ARM port,  TARGET_LIKELY_SPILLED_CLASS is defined for Thumb-1 and
 for thumb 2 mode there is mention of using LO_REG in the comment as
 below.

 "Care should be taken to avoid adding thumb-2 patterns that require
 many low registers"

 So conservative fix is not to allow this pattern for Thumb-2 mode.
>>>
>>> I don't have an additional solution off the top of my head and
>>> probably need to go do some digging.
>>>
>>> It sounds like the conservative fix but what's the impact of doing so
>>> ? Have you measured that in terms of performance or code size on a
>>> range of benchmarks ?
>>>

>>
>> I haven't done any benchmark testing. I will try and run some
>> benchmarks with my patch.
>>
>>
 I allowed these pattern for Thumb2 when we have constant operands for
 comparison. That makes the target tests arm/thum2-cond-cmp-1.c to
 thum2-cond-cmp-4.c pass.
>>>
>>> That sounds fine and fair - no trouble there.
>>>
>>> My concern is with removing the register alternatives and losing the
>>> ability to trigger conditional compares on 4.9 and trunk for Thumb1
>>> till the time the "new" conditional compare work makes it in.
>>>
>>> Ramana
>
> I tested this conservative fix with Coremark (ran it on a Chromebook) and
> SPEC cpu2006 (cross compiled and compared assembly differences).
>
> With Coremark there are no performance issues. In fact there are no
> assembly differences with CPU flags for A15 and A9.
>
> For SPEC2006 I cross compiled and compared assembly differences with
> and without patch (-O3 -fno-common).
> I have not run these benchmarks.
>
> There are major code differences and are due to conditional compare
> instructions not getting generated as you expected. This also results
> in different physical register numbers assigned in the generated code
> and also there are code scheduling differences when comparing it with
> base.
>
>
> I am showing a simple test case to showcase the conditional compare
> difference I am seeing in SPEC2006 benchmarks.
>
> char a,b;
> int i;
> int f( int j)
> {
>   if ( (i >= a) ? (j <= a) : 1)
> return 1;
>   else
> return 0;
> }
>
> GCC FSF 4.9
> ---
>
> movw    r2, #:lower16:a
> movw    r3, #:lower16:i
> movt    r2, #:upper16:a
> movt    r3, #:upper16:i
> ldrb    r2, [r2]    @ zero_extendqisi2
> ldr     r3, [r3]
> cmp     r2, r3
> it      le
> cmple   r2, r0  <== conditional compare instruction generated.
> ite     ge
> movge   r0, #1
> movlt   r0, #0
> bx      lr
>
>
> With patch
> -
>
> movw    r2, #:lower16:a
> movw    r3, #:lower16:i
> movt    r2, #:upper16:a
> movt    r3, #:upper16:i
> ldr     r3, [r3]
> ldrb    r2, [r2]    @ zero_extendqisi2
> cmp     r2, r3
> ite     le
> movle   r3, #0 <== Unoptimal moves generated.
> movgt   r3, #1 <==
> cmp     r2, r0
> ite     lt
> movlt   r0, r3
> orrge   r0, r3, #1 <==
> bx      lr
>
> The following benchmarks have maximum number of conditional compare
> pattern differences and also code scheduling changes/different
> physical register numbers in generated code.
> 416.gamess/
> 434.zeusmp/
> 400.perlbench
> 403.gcc/
> 445.gobmk
> 483.xalancbmk/
> 401.bzip2
> 433.milc/

[jit] Eliminate fixed-size buffer for a context's first error message

2014-09-26 Thread David Malcolm
On Wed, 2014-09-24 at 22:04 -0600, Jeff Law wrote:
On 09/24/14 14:24, Joseph S. Myers wrote:
> > On Wed, 24 Sep 2014, David Malcolm wrote:
> >
> >> The ideal I'm aiming for here is that a well-behaved library should
> >> never abort, so I've rewritten these functions to use vasprintf, and
> >> added error-handling checks to cover the case where malloc returns NULL
> >> within vasprintf.
> >
> > GCC is designed on the basis of aborting on allocation failures - as is
> > GMP, which allows custom allocation functions to be specified but still
> > requires them to exit the program rather than return, longjmp or throw an
> > exception.
> But that may be something we need to change if GCC is going to be used 
> as a JIT in the future.  Yea, we'll still have problems of this nature 
> in libraries that GCC itself might use such as gmp, but that doesn't 
> mean that we have to perpetuate that practice in GCC itself.
> >>
> >> Presumably I should address this in a followup, by making that be
> >> dynamically-allocated?
> >
> > Yes.  Arbitrary limits should be avoided in GNU.
> Agreed.

Fixed in the following; committed to branch dmalcolm/jit:

This removes the truncation of overlong error messages in
  gcc::jit::recording::context::add_error_va
ensuring that API entrypoint gcc_jit_context_get_first_error reports
them without truncation.
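
For readers outside the jit branch, the ownership scheme can be sketched in
plain C as follows. This is not the libgccjit code: the error_ctx struct and
both functions are hypothetical, and vsnprintf is used twice in place of
vasprintf so that the sketch sticks to standard C. The fallback string is
also my own wording.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical context that remembers its first error message.  */
struct error_ctx
{
  const char *first_error;   /* NULL until an error is recorded.  */
  bool owns_first_error;     /* True if first_error must be freed.  */
  int error_count;
};

/* Record an error.  The first message is kept at full length; if the
   allocation fails, a static string is stored instead and not owned.  */
void
error_ctx_add (struct error_ctx *ctx, const char *fmt, ...)
{
  va_list ap, ap2;
  char *msg = NULL;

  va_start (ap, fmt);
  va_copy (ap2, ap);
  int len = vsnprintf (NULL, 0, fmt, ap);   /* Measure first.  */
  va_end (ap);

  if (len >= 0)
    msg = malloc ((size_t) len + 1);
  if (msg)
    vsnprintf (msg, (size_t) len + 1, fmt, ap2);
  va_end (ap2);

  if (ctx->error_count++ == 0)
    {
      ctx->first_error = msg ? msg : "out of memory while formatting error";
      ctx->owns_first_error = (msg != NULL);
    }
  else
    free (msg);   /* Later messages are not retained in this sketch.  */
}

/* Release the stored message if the context owns it.  */
void
error_ctx_release (struct error_ctx *ctx)
{
  if (ctx->owns_first_error)
    free ((char *) ctx->first_error);
  ctx->first_error = NULL;
  ctx->owns_first_error = false;
}
```

The ownership flag is what lets the same pointer field hold either a heap
string or a string literal without the destructor freeing the literal.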

gcc/jit/ChangeLog.jit:
* internal-api.h (gcc::jit::recording::context): Convert field
"m_first_error_str" from a fixed-size buffer to a pointer, and add
a field "m_owns_first_error_str" to determine if we're responsible
for freeing it.
* internal-api.c (gcc::jit::recording::context::context): Update
initializations in ctor for above change.
(gcc::jit::recording::context::~context): Free m_first_error_str
if we own it.
(gcc::jit::recording::context::add_error_va): When capturing the
first error message on a context, rather than copying "errmsg" to
a fixed-size buffer and truncating if oversize, simply store the
pointer to the error message, and flag whether we need to free it.
(gcc::jit::recording::context::get_first_error): Update for change
of "m_first_error_str" from an internal buffer to a pointer.
---
 gcc/jit/ChangeLog.jit  | 17 +
 gcc/jit/internal-api.c | 32 
 gcc/jit/internal-api.h |  4 +++-
 3 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 9cbba20..ac8f28d 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,20 @@
+2014-09-26  David Malcolm  
+
+   * internal-api.h (gcc::jit::recording::context): Convert field
+   "m_first_error_str" from a fixed-size buffer to a pointer, and add
+   a field "m_owns_first_error_str" to determine if we're responsible
+   for freeing it.
+   * internal-api.c (gcc::jit::recording::context::context): Update
+   initializations in ctor for above change.
+   (gcc::jit::recording::context::~context): Free m_first_error_str
+   if we own it.
+   (gcc::jit::recording::context::add_error_va): When capturing the
+   first error message on a context, rather than copying "errmsg" to
+   a fixed-size buffer and truncating if oversize, simply store the
+   pointer to the error message, and flag whether we need to free it.
+   (gcc::jit::recording::context::get_first_error): Update for change
+   of "m_first_error_str" from an internal buffer to a pointer.
+
 2014-09-25  David Malcolm  
 
* internal-api.c (gcc::jit::playback::context::compile): Use
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 05ef544..8ef9af9 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -188,14 +188,14 @@ recording::playback_block (recording::block *b)
 recording::context::context (context *parent_ctxt)
   : m_parent_ctxt (parent_ctxt),
 m_error_count (0),
+m_first_error_str (NULL),
+m_owns_first_error_str (false),
 m_mementos (),
 m_compound_types (),
 m_functions (),
 m_FILE_type (NULL),
 m_builtins_manager(NULL)
 {
-  m_first_error_str[0] = '\0';
-
   if (parent_ctxt)
 {
   /* Inherit options from parent.
@@ -234,6 +234,9 @@ recording::context::~context ()
 
   if (m_builtins_manager)
 delete m_builtins_manager;
+
+  if (m_owns_first_error_str)
+free (m_first_error_str);
 }
 
 /* Add the given mememto to the list of those tracked by this
@@ -901,12 +904,19 @@ recording::context::add_error_va (location *loc, const 
char *fmt, va_list ap)
 {
   char *malloced_msg;
   const char *errmsg;
+  bool has_ownership;
 
   vasprintf (&malloced_msg, fmt, ap);
   if (malloced_msg)
-errmsg = malloced_msg;
+{
+  errmsg = malloced_msg;
+  has_ownership = true;
+}
   else
-errmsg = "out of memory generating error message";
+{
+  errmsg = "out of memory generating error mess

Re: [PATCH 3/5] IPA ICF pass

2014-09-26 Thread Markus Trippelsdorf
On 2014.09.26 at 14:20 +0200, Martin Liška wrote:
> After couple of weeks I spent with fixing new issues connected to the
> pass: 1) Inliner failed in case I created a thunk and release body of
> a function. In such situation we need to preserve DECL_ARGUMENTS. I
> added new argument for: cgraph_node::release_body.  2) Awkward error
> was hidden in libstdc++ test for trees, there were two functions
> having one argument that differs in one sub-template. Thank to Richard
> who helped me to fix alias set accuracy.  3) There was missing
> comparison for FIELD_DECLS (DECL_FIELD_BIT_OFFSET) which caused me
> miscompilation.  4) After discussion with Honza, we introduced new
> cgraph_node flag called icf_merged. The flag helps to fix verifier in
> cgraph_node::verify.
> 
> Current version of the patch can bootstrap on x86_64-linux. With
> following patch applied, there's not testcase regression.  I tried to
> build Firefox, Inkscape, GIMP and Chromium with LTO and patch applied
> and no regression has been observed.

While a plain Firefox -flto build works fine, an LTO/PGO build fails with:

lto1: internal compiler error: in ipa_merge_profiles, at ipa-utils.c:540
0x7d6165 ipa_merge_profiles(cgraph_node*, cgraph_node*)
../../gcc/gcc/ipa-utils.c:540
0xf10c41 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
../../gcc/gcc/ipa-icf.c:753
0xf15206 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
../../gcc/gcc/ipa-icf.c:2706
0xf1c1f4 ipa_icf::sem_item_optimizer::execute()
../../gcc/gcc/ipa-icf.c:2098
0xf1d3f1 ipa_icf_driver
../../gcc/gcc/ipa-icf.c:2784
0xf1d3f1 ipa_icf::pass_ipa_icf::execute(function*)
../../gcc/gcc/ipa-icf.c:2831


The pass is also very memory hungry (from 3GB without ICF to 4GB during
libxul link), while the code size savings are in the 1% range.

-- 
Markus


Re: [shrink-wrap] should not sink instructions which may cause trap ?

2014-09-26 Thread Jiong Wang


On 26/09/14 09:36, Richard Biener wrote:

On Fri, Sep 26, 2014 at 12:49 AM, Jiong Wang
 wrote:

2014-09-25 14:07 GMT+01:00 Jiong Wang :

On 25/09/14 12:25, Christophe Lyon wrote:

I have observed regressions in the g++ testsuite: pr49847 now FAILs
after this patch.

No.

Even without my patch, the regression still happens.

Or you could specify -fno-shrink-wrap; gcc still crashes.

So this regression should be caused by other commits which haven't been
exposed by x86 regression testing.

Sorry, confirmed, there is a regression.

my code was git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215590.
there also be gcc crash on aarch64, with the following info,
   pr49847.C:5:21: internal compiler error: Segmentation fault
  try { return g >= 0; }
  ^
   0xdc249e crash_signal
   ../../gcc/gcc/toplev.c:340
   0xdbfeff default_get_reg_raw_mode(int)

so I thought it was caused by other commits instead of this one, but after I
synced my code to git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215599 I could
reproduce this bug.

   error: missing REG_EH_REGION note at the end of bb 2

the reason is:
   * before this patch, we only sank simple "set reg, reg" instructions, which
     cannot raise an exception, so no REG_EH_REGION note is attached.
   * after this patch, we also sink instructions like the following on
     aarch64, arm or other RISC targets:

 (insn 7 3 30 2 (set (reg:CCFPE 66 cc)
         (compare:CCFPE (reg:SF 32 v0 [ g ])
             (const_double:SF 0.0 [0x0.0p+0]))) pr49847.C:5 330 {*cmpesf}
      (expr_list:REG_DEAD (reg:SF 32 v0 [ g ])
         (expr_list:REG_EH_REGION (const_int 1 [0x1])
             (nil))))

   "compare" is actually an operator which may trap, and we need to prevent
   any instruction which may trap from being sunk, because that may
   break the exception handling logic.

   so something like the following should be added to the iterator check:

   if (may_trap_p (x))
     don't sink this instruction.

any comments?

Should be checking if x may throw internally instead.


Richard, thanks for the suggestion. I have used insn_could_throw_p to do the
check, which only applies when flag_exceptions and flag_non_call_exceptions are
true, so those instructions can still be sunk for normal C/C++ programs.

Jeff,

  below is the fix for the pr49847.C regression on aarch64. I re-ran the full
  test suite on aarch64-none-elf bare metal, no regression.

  bootstrap ok on x86, no regression on check-gcc/g++.

  ok for trunk?

  -- Jiong



Richard.


I will try to send a fix tomorrow.

thanks.

-- Jiong



-- Jiong



Here is what I have in my logs:

/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/gcc/testsuite/g++/../../xg++

-B/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/gcc/testsuite/g++/../../
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C
-fno-diagnostics-show-caret -fdiagnostics-color=never  -nostdinc++

-I/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/arm-none-linux-gnueabihf/libstdc++-v3/include/arm-none-linux-gnueabihf

-I/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/arm-none-linux-gnueabihf/libstdc++-v3/include
-I/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
-I/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/include/backward
-I/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/util
-fmessage-length=0  -std=gnu++98 -O -fnon-call-exceptions  -S -o
pr49847.s(timeout = 800)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C: In
function 'int f(float)':
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C:7:1:
error: missing REG_EH_REGION note at the end of bb 2
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/g++.dg/pr49847.C:7:1:
internal compiler error: verify_flow_info failed
0x82f8ba verify_flow_info()
  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfghooks.c:260

0x840cd3 commit_edge_insertions()
  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgrtl.c:2068
0x9bf243 thread_prologue_and_epilogue_insns
  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:5852
0x9bfa52 rest_of_handle_thread_prologue_and_epilogue
  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:6245
0x9bfa52 execute
  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:6283

As per

http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215563/report-build-info.html
I've noticed this on targets:
arm-none-linux-gnueabihf
armeb-none-linux-gnueabihf
aarch64-none-elf
aarch64_be-none-elf
aarch64-none-linux-gnu
but NOT on
arm-none-eabi
arm-none-linux-gnueabi

Christophe.



diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index bd4813c..2e2f0a6 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -189,6 +189,9 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_insn *insn,
   unsigned int nonconstobj_num = 0;
   rtx src_inner = NULL_RTX;

+  if (insn_could_throw_p (insn

Re: [shrink-wrap] should not sink instructions which may cause trap ?

2014-09-26 Thread Jiong Wang

On 26/09/14 15:45, Jiong Wang wrote:

On 26/09/14 09:36, Richard Biener wrote:

On Fri, Sep 26, 2014 at 12:49 AM, Jiong Wang
 wrote:

2014-09-25 14:07 GMT+01:00 Jiong Wang :

On 25/09/14 12:25, Christophe Lyon wrote:

I have observed regressions in the g++ testsuite: pr49847 now FAILs
after this patch.

no.

even without my patch, the regression still happens.

or you could specify -fno-shrink-wrap, and gcc still crashes.

so this regression should be caused by other commits which are not exposed by
the x86 regression tests.

sorry, confirmed, there is a regression.

my code was at git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215590.
there was also a gcc crash on aarch64, with the following info:
pr49847.C:5:21: internal compiler error: Segmentation fault
   try { return g >= 0; }
   ^
0xdc249e crash_signal
../../gcc/gcc/toplev.c:340
0xdbfeff default_get_reg_raw_mode(int)

so I thought it was caused by other commits instead of this one, but after I
synced my code to git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215599 I could
reproduce this bug.

error: missing REG_EH_REGION note at the end of bb 2

the reason is:
* before this patch, we only sank simple "set reg, reg" instructions, which
  cannot raise an exception, so no REG_EH_REGION note is attached.
* after this patch, we also sink instructions like the following on
  aarch64, arm or other RISC targets:

  (insn 7 3 30 2 (set (reg:CCFPE 66 cc)
          (compare:CCFPE (reg:SF 32 v0 [ g ])
              (const_double:SF 0.0 [0x0.0p+0]))) pr49847.C:5 330 {*cmpesf}
       (expr_list:REG_DEAD (reg:SF 32 v0 [ g ])
          (expr_list:REG_EH_REGION (const_int 1 [0x1])
              (nil))))

"compare" is actually an operator which may trap, and we need to prevent
any instruction which may trap from being sunk, because that may
break the exception handling logic.

so something like the following should be added to the iterator check:

if (may_trap_p (x))
  don't sink this instruction.

 any comments?

Should be checking if x may throw internally instead.

Richard, thanks for the suggestion. I have used insn_could_throw_p to do the
check, which only applies when flag_exceptions and flag_non_call_exceptions are
true, so those instructions can still be sunk for normal C/C++ programs.

Jeff,

below is the fix for the pr49847.C regression on aarch64. I re-ran the full
test suite on aarch64-none-elf bare metal, no regression.

bootstrap ok on x86, no regression on check-gcc/g++.

ok for trunk?


(re-sent with changelog entry)

gcc/

2014-09-26  Jiong Wang

* shrink-wrap.c (move_insn_for_shrink_wrap): Check "insn_could_throw_p"
before sinking insn.
diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index bd4813c..2e2f0a6 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -189,6 +189,9 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_insn *insn,
   unsigned int nonconstobj_num = 0;
   rtx src_inner = NULL_RTX;

+  if (insn_could_throw_p (insn))
+	return false;
+
   subrtx_var_iterator::array_type array;
   FOR_EACH_SUBRTX_VAR (iter, array, src, ALL)
 	{

Re: [COMMITTED][PATCH, 2/2] shrink wrap a function with a single loop: split live_edge

2014-09-26 Thread H.J. Lu
On Thu, Sep 25, 2014 at 9:43 AM, Jiong Wang  wrote:
>
> On 25/09/14 17:24, Jeff Law wrote:
>>
>> On 09/25/14 09:04, Jiong Wang wrote:
>>>
>>> new patch updated.
>>>
>>> pass bootstrap and no regression, both check-gcc and check-g++, on the
>>> x86.
>>>
>>> OK for trunk?
>>>
>>> thanks.
>>>
>>> gcc/
>>>  * shrink-wrap.c (move_insn_for_shrink_wrap): Initialize the live-in
>>> of
>>>  new created BB as the intersection of live-in from "old_dest" and
>>> live-out
>>>  from "bb".
>>
>> Please include a ChangeLog entry for the testsuite.  Something like:
>>
>> * gcc.target/i386/shrink_wrap_1.c: New test.

This fails on Linux/x86 and with -m32 on Linux/x86-64.

>> With that addition, OK for the trunk.
>
>
> committed as r215611.



-- 
H.J.


Re: [COMMITTED][PATCH, 2/2] shrink wrap a function with a single loop: split live_edge

2014-09-26 Thread Jiong Wang


On 26/09/14 16:05, H.J. Lu wrote:

On Thu, Sep 25, 2014 at 9:43 AM, Jiong Wang  wrote:

On 25/09/14 17:24, Jeff Law wrote:

On 09/25/14 09:04, Jiong Wang wrote:

new patch updated.

pass bootstrap and no regression, both check-gcc and check-g++, on the
x86.

OK for trunk?

thanks.

gcc/
  * shrink-wrap.c (move_insn_for_shrink_wrap): Initialize the live-in
of
  new created BB as the intersection of live-in from "old_dest" and
live-out
  from "bb".

Please include a ChangeLog entry for the testsuite.  Something like:

 * gcc.target/i386/shrink_wrap_1.c: New test.

This fails on Linux/x86 and with -m32 on Linux/x86-64.

sorry, my test machine is x86-64. I think the shrink-wrap test itself is
very fragile because it depends heavily on the insns generated.

could you mark that testcase using something like
"dg-require-effective-target lp64"?





With that addition, OK for the trunk.


committed as r215611.








Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Bernd Schmidt

On 09/26/2014 02:42 PM, Richard Biener wrote:

On Fri, Sep 26, 2014 at 2:28 PM, Bernd Schmidt  wrote:

On 09/26/2014 02:26 PM, Richard Biener wrote:


On Fri, Sep 26, 2014 at 2:14 PM, Bernd Schmidt 
wrote:


On 09/26/2014 02:05 PM, Richard Biener wrote:



If currently address-space support matches up with the C frontend
and the C standard then the middle-end has to cope with that.
In this case, cope with array element types not having address-space
qualifiers.




That's the opposite of what happens. The C frontend makes array element
types have address-space qualifiers but not the array type.



Ah, ok.  Then the opposite way around ;)



Ok, so that means that my original patch which updated the element types for
arrays is in fact the way to go?


It seems to do both, apply the as to the array _and_ the element type, no?


Yes. I guess I could not do this, but then the patch will also have to 
replace all but very few uses of TYPE_ADDR_SPACE outside the C frontend 
with a new addr_space_for_type function that checks for arrays.


I can do that, but to me it feels like utterly the wrong way to go. If 
you're sure that's what you want, I'll make a patch.



Bernd
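
A minimal sketch of what an addr_space_for_type-style helper might look like, using toy stand-in types rather than GCC's tree nodes. The names toy_type, toy_kind and toy_addr_space_for_type are hypothetical and exist only for this illustration; a real helper would presumably walk ARRAY_TYPE nodes and read TYPE_ADDR_SPACE on the innermost element type.

```c
#include <assert.h>
#include <stddef.h>

/* Toy type representation; NOT GCC's tree structure.  */
enum toy_kind { TOY_SCALAR, TOY_ARRAY };

struct toy_type
{
  enum toy_kind kind;
  const struct toy_type *elem; /* element type for arrays, NULL otherwise */
  int addr_space;              /* 0 models the generic address space */
};

/* Return the address space that applies to TYPE.  For arrays the
   qualifier is taken from the innermost element type, which models a
   frontend that qualifies element types but not the array type.  */
int
toy_addr_space_for_type (const struct toy_type *type)
{
  assert (type != NULL);
  while (type->kind == TOY_ARRAY)
    {
      assert (type->elem != NULL);
      type = type->elem;
    }
  return type->addr_space;
}
```

With such a helper every caller outside the frontend would have to go through the function instead of reading the qualifier directly, which is the cost Bernd is pointing at.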




[jit] Add a test of using very long names

2014-09-26 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/testsuite/ChangeLog.jit:
* jit.dg/test-long-names.c: New test case.
* jit.dg/all-non-failing-tests.h: Add test-long-names.c
* jit.dg/test-combination.c (create_code): Likewise.
(verify_code): Likewise.
* jit.dg/test-threads.c (testcases): Likewise.
---
 gcc/testsuite/jit.dg/all-non-failing-tests.h |   7 ++
 gcc/testsuite/jit.dg/test-combination.c  |   2 +
 gcc/testsuite/jit.dg/test-long-names.c   | 112 +++
 gcc/testsuite/jit.dg/test-threads.c  |   3 +
 4 files changed, 124 insertions(+)
 create mode 100644 gcc/testsuite/jit.dg/test-long-names.c

diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h b/gcc/testsuite/jit.dg/all-non-failing-tests.h
index 5f7b2ec..10d7199 100644
--- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
+++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
@@ -102,6 +102,13 @@
 #undef create_code
 #undef verify_code
 
+/* test-long-names.c */
+#define create_code create_code_long_names
+#define verify_code verify_code_long_names
+#include "test-long-names.c"
+#undef create_code
+#undef verify_code
+
 /* test-quadratic.c */
 #define create_code create_code_quadratic
 #define verify_code verify_code_quadratic
diff --git a/gcc/testsuite/jit.dg/test-combination.c b/gcc/testsuite/jit.dg/test-combination.c
index 9d3a535..06ba902 100644
--- a/gcc/testsuite/jit.dg/test-combination.c
+++ b/gcc/testsuite/jit.dg/test-combination.c
@@ -28,6 +28,7 @@ create_code (gcc_jit_context *ctxt, void * user_data)
   create_code_functions (ctxt, user_data);
   create_code_hello_world (ctxt, user_data);
   create_code_linked_list (ctxt, user_data);
+  create_code_long_names (ctxt, user_data);
   create_code_quadratic (ctxt, user_data);
   create_code_nested_loop (ctxt, user_data);
   create_code_reading_struct  (ctxt, user_data);
@@ -54,6 +55,7 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
   verify_code_functions (ctxt, result);
   verify_code_hello_world (ctxt, result);
   verify_code_linked_list (ctxt, result);
+  verify_code_long_names (ctxt, result);
   verify_code_quadratic (ctxt, result);
   verify_code_nested_loop (ctxt, result);
   verify_code_reading_struct (ctxt, result);
diff --git a/gcc/testsuite/jit.dg/test-long-names.c b/gcc/testsuite/jit.dg/test-long-names.c
new file mode 100644
index 000..0fc7e67
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-long-names.c
@@ -0,0 +1,112 @@
+/* Test of using the API with very long names.  */
+
+#include 
+#include 
+
+#include "libgccjit.h"
+
+#include "harness.h"
+
+/* 65KB */
+#define NAME_LENGTH (65 * 1024)
+
+static struct long_names
+{
+  char struct_name[NAME_LENGTH];
+  char fn_name[NAME_LENGTH];
+  char local_name[NAME_LENGTH];
+  char block_name[NAME_LENGTH];
+} long_names;
+
+static void
+populate_name (const char *prefix, char *buffer)
+{
+  int i;
+
+  /* Begin with the given prefix: */
+  sprintf (buffer, prefix);
+
+  /* Populate the rest of the buffer with 0123456789 repeatedly: */
+  for (i = strlen (prefix); i < NAME_LENGTH - 1; i++)
+buffer[i] = '0' + (i % 10);
+
+  /* NIL-terminate the buffer: */
+  buffer[NAME_LENGTH - 1] = '\0';
+}
+
+static void
+populate_names (void)
+{
+  populate_name ("struct_", long_names.struct_name);
+  populate_name ("test_fn_", long_names.fn_name);
+  populate_name ("local_", long_names.local_name);
+  populate_name ("block_", long_names.block_name);
+}
+
+void
+create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  /* Where "ETC" is a very long suffix, let's try to inject the
+ equivalent of:
+
+   struct struct_ETC;
+
+   int
+   test_fn_ETC ()
+   {
+ int local_ETC;
+ local_ETC = 42;
+ return local_ETC;
+   }
+
+ to verify that the API copes with such long names.  */
+
+  populate_names ();
+
+  gcc_jit_type *int_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+
+  /* We don't yet use this struct.  */
+  (void)gcc_jit_context_new_opaque_struct (ctxt, NULL,
+  long_names.struct_name);
+
+  gcc_jit_function *test_fn =
+gcc_jit_context_new_function (ctxt, NULL,
+ GCC_JIT_FUNCTION_EXPORTED,
+ int_type,
+ long_names.fn_name,
+ 0, NULL,
+ 0);
+  gcc_jit_lvalue *local =
+gcc_jit_function_new_local (test_fn,
+   NULL,
+   int_type,
+   long_names.local_name);
+
+  gcc_jit_block *block =
+gcc_jit_function_new_block (test_fn, long_names.block_name);
+
+  gcc_jit_block_add_assignment (
+block,
+NULL,
+local,
+gcc_jit_context_new_rvalue_from_int (ctxt, int_type, 42));
+
+  gcc_jit_block_end_with_return (
+block, NULL,
+gcc_jit_lvalue_as_rvalue (local));
+}
+
+void
+verify_code (gcc_jit_context *
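
The populate_name helper in the test above fills a buffer with a prefix followed by repeating digits. Below is a standalone sketch of the same idea, not the committed test code: the hypothetical toy_populate_name takes the buffer size explicitly and copies the prefix with memcpy instead of passing it to sprintf as a format string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill BUFFER (SIZE bytes) with PREFIX followed by repeating decimal
   digits, always leaving room for the terminating NUL.
   toy_populate_name is a hypothetical name used for this sketch only.  */
void
toy_populate_name (const char *prefix, char *buffer, size_t size)
{
  size_t prefix_len = strlen (prefix);
  size_t i;

  assert (size > 0 && prefix_len < size);

  /* Copy the prefix as plain data, not as a format string.  */
  memcpy (buffer, prefix, prefix_len);

  /* The digit at each position depends on the absolute index, as in
     the test above.  */
  for (i = prefix_len; i < size - 1; i++)
    buffer[i] = (char) ('0' + (i % 10));

  buffer[size - 1] = '\0';
}
```

Calling it with a 65KB buffer, as the test does with NAME_LENGTH, would give a name of NAME_LENGTH - 1 characters.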

Re: [PATCH][match-and-simplify] Enable conversions properly for GENERIC

2014-09-26 Thread Jason Merrill

On 09/26/2014 06:54 AM, Richard Biener wrote:

It also uncovers that the C++ FE uses a mix of NOP_EXPR and
CONVERT_EXPR both when building expressions and when checking
for them.  Jason - is there any difference between NOP_EXPR and
CONVERT_EXPR as far as the C++ FE is concerned?


There are a few cases where CONVERT_EXPR is special, but mostly no.


I have
silenced -Wsign-compare warnings that the patch caused by
making enum_cast_to_int "accept" both NOP_EXPR and CONVERT_EXPR
as conversion code (ok for trunk?).


OK.

Jason




Re: [COMMITTED][PATCH, 2/2] shrink wrap a function with a single loop: split live_edge

2014-09-26 Thread H.J. Lu
On Fri, Sep 26, 2014 at 8:14 AM, Jiong Wang  wrote:
>
> On 26/09/14 16:05, H.J. Lu wrote:
>>
>> On Thu, Sep 25, 2014 at 9:43 AM, Jiong Wang  wrote:
>>>
>>> On 25/09/14 17:24, Jeff Law wrote:

 On 09/25/14 09:04, Jiong Wang wrote:
>
> new patch updated.
>
> pass bootstrap and no regression, both check-gcc and check-g++, on the
> x86.
>
> OK for trunk?
>
> thanks.
>
> gcc/
>   * shrink-wrap.c (move_insn_for_shrink_wrap): Initialize the
> live-in
> of
>   new created BB as the intersection of live-in from "old_dest" and
> live-out
>   from "bb".

 Please include a ChangeLog entry for the testsuite.  Something like:

  * gcc.target/i386/shrink_wrap_1.c: New test.
>>
>> This fails on Linux/x86 and with -m32 on Linux/x86-64.
>
> sorry, my test machine is x86-64. I think the shrink-wrap test itself is
> very fragile because it depends heavily on the insns generated.
>
> could you mark that testcase using something like
> "dg-require-effective-target lp64"?

I checked in this patch to skip it on ia32.


-- 
H.J.
---
Index: ChangeLog
===
--- ChangeLog (revision 215644)
+++ ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2014-09-26  H.J. Lu  
+
+ * gcc.target/i386/shrink_wrap_1.c: Skip ia32.
+
 2014-09-26  Jakub Jelinek  

  * g++.dg/compat/struct-layout-1_generate.c: Add -Wno-abi
Index: gcc.target/i386/shrink_wrap_1.c
===
--- gcc.target/i386/shrink_wrap_1.c (revision 215644)
+++ gcc.target/i386/shrink_wrap_1.c (working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! ia32 } } } */
 /* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */

 enum machine_mode


Re: [COMMITTED][PATCH, 2/2] shrink wrap a function with a single loop: split live_edge

2014-09-26 Thread Jiong Wang


On 26/09/14 16:28, H.J. Lu wrote:

On Fri, Sep 26, 2014 at 8:14 AM, Jiong Wang  wrote:

could you mark that testcase using something like
"dg-require-effective-target lp64"?

I checked in this patch to skip it on ia32.


great, thanks !
-- Jiong





Re: [PATCH] fix hardreg_cprop to honor HARD_REGNO_MODE_OK.

2014-09-26 Thread Ilya Tocar
On 25 Sep 13:14, Jeff Law wrote:
> On 09/01/14 04:29, Ilya Tocar wrote:
> >>>
> >>>AVX512 added new 16 xmm registers (xmm16-xmm31).
> >>>Those registers require evex encoding.
> >>>Only 512-bit wide versions of instructions have evex encoding with
> >>>avx512f, but all versions have it with avx512vl.
> >>>Most instructions have same macroized pattern for 128/256/512 vector
> >>>length. They all use constraint 'v', which corresponds to
> >>>class ALL_SSE_REGS (xmm0 - xmm31). To disallow e. g. xmm20 in
> >>>256-bit case (avx512f) and allow it only in avx512vl case we have
> >>>HARD_REGNO_MODE_OK checking for regno being evex-only and
> >>>disallowing it if mode is not 512-bit.
> >>Generally this kind of thing has been handled by splitting the register
> >>class into two classes.  I strongly suspect there are numerous places where
> >>we assume that two regs in the same class are interchangeable.
> >I'm not sure that there are many places where we replace hard regs
> >without checks. E. g. in regrename we have HARD_REGNO_RENAME_OK.
> >As far as I understand, idea behind HARD_REGNO_RENAME_OK is that we
> >should always check when substituting hard reg. Why is regcprop
> >different, and what's the point of HARD_REGNO_MODE_OK if it is ignored
> >by some passes?
> >
> >>
> >>I realize that's going to require some work in the x86 machine description,
> >>but I think that's going to be a much better approach and save you work in
> >>the long run.
> >>
> >
> >This will approximately double sse.md, as we will need to split all
> >patterns with 512-bit versions in 2 (512 and 128/256 cases) and play
> >games with enabling/disabling alternatives depending on flags.
> >Are you sure that this better than honoring HARD_REGNO_MODE_OK?
> >As far as I understand, honoring  HARD_REGNO_MODE_OK shouldn't produce
> >worse code.
> I don't see how it doubles the size.  You split the class into two classes.
> Whatever letter your second class has, you use it in conjunction with 'v'
> that you're already using.  Note you do not need different alternatives, you
> use them in the same alternative.
I'm not sure how this will help. Consider
add: right now all vector lengths are described in one pattern.
In the AVX512F (without AVX512VL) case we can use xmm16 for V8DF, but not for
V2DF or V4DF. If we keep them in one pattern, they will have the same
alternatives for all modes. So we will need to either
split V2DF and V4DF into a separate pattern (doubling the number of patterns), or
disallow particular modes depending on flags (which is what we do now).

> 
> It's not a question of performance, but of design.
Obviously, but I still fail to see why honoring HARD_REGNO_MODE_OK is
bad design. I suspect that even without avx512 changes not honoring it will
bite us sooner or later.
> I suspect you're really
> just at the tip of the iceberg with this stuff if you continue to go down
> the path of having registers in the same class, some of which are
> allocatable and some of which are not.
Having a class where some registers are not available is an old approach:
consider the SSE_REGS class, where half of the registers are not available in
the 32-bit case. The problem is with different modes being valid in those
registers, depending on flags. And it worked fine for the previous
~year in gcc 4.9. In my opinion, if we check in the original patch we will
harm no one, and fix a correctness problem. If we later discover some new
problem that is not fixable by a simple patch, we may rework all of the avx512
implementation. As bugs of this kind will never generate incorrect
code (all errors will be caught by the assembler), I see no reason not to
check it in.
> 
> The other approach that I believe has been taken has been to mark the new
> registers as fixed when compiling for hardware where they're not available.
> But I'm not sure offhand if that would be sufficient to fix this problem.
It will not help. Registers are available. Just some modes are not
supported.
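
As a rough illustration of the rule discussed in this message, the sketch below models a HARD_REGNO_MODE_OK-style predicate for the xmm registers. It is a toy model under stated assumptions, not the i386 backend code: AVX512F is assumed to be enabled, registers are numbered 0..31 for xmm0..xmm31, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbering for this sketch only: xmm0..xmm31 map to 0..31.  */
enum { TOY_FIRST_EVEX_ONLY_REG = 16, TOY_LAST_XMM_REG = 31 };

/* Return true if a vector of VECTOR_BITS (128, 256 or 512) may live in
   register REGNO, given whether AVX512VL is enabled.  */
bool
toy_hard_regno_mode_ok (int regno, int vector_bits, bool have_avx512vl)
{
  assert (regno >= 0 && regno <= TOY_LAST_XMM_REG);
  assert (vector_bits == 128 || vector_bits == 256 || vector_bits == 512);

  /* xmm0-xmm15 do not need EVEX encoding, so every width is fine.  */
  if (regno < TOY_FIRST_EVEX_ONLY_REG)
    return true;

  /* xmm16-xmm31 need EVEX: 512-bit operations have it with AVX512F,
     narrower ones only with AVX512VL.  */
  return vector_bits == 512 || have_avx512vl;
}

/* A copy-propagation style replacement re-checks the mode for the
   candidate register instead of assuming that every register in the
   class is interchangeable.  */
bool
toy_can_replace_reg (int new_regno, int vector_bits, bool have_avx512vl)
{
  return toy_hard_regno_mode_ok (new_regno, vector_bits, have_avx512vl);
}
```

In this model xmm20 is accepted for a 512-bit value (V8DF) but rejected for a 256-bit one (V4DF) unless AVX512VL is on, which is the case the patch wants regcprop to respect.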


Re: [PATCH] Fix PR63266: Keep track of impact of sign extension in bswap

2014-09-26 Thread Christophe Lyon
On 26 September 2014 04:25, Thomas Preud'homme
 wrote:
>> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
>> Sent: Thursday, September 25, 2014 10:08 PM
>>
>
>> While attempting to try this, I noticed that more precisely the test
>> is currently UNSUPPORTED on aarch64_be,
>> which is because check_effective_target_bswap only accepts istarget
>> aarch64-*-*.
>
> Ah yes, of course.
>
>>
>> I didn't try yet to change it into istarget aarch64*-*-*.
>
> It should probably be added no matter the result anyway, since this target 
> has bswap instructions.
>

Fixing check_effective_target_bswap to accept aarch64*-*-* makes the
test pass, so we should submit that patch.

I tried the other change you suggested, but it seems that
scan-tree-dump-times only matched 3 times. I did this in a bit of a
hurry though, so I may have done something wrong.
I'm not sure when I will have time to look at that again, so I prefer to
give this little feedback now :-)

Christophe.

> Best regards,
>
> Thomas
>
>
>


Re: [shrink-wrap] should not sink instructions which may cause trap ?

2014-09-26 Thread Jeff Law

On 09/26/14 08:50, Jiong Wang wrote:



if (may_trap_p (x))
  don't sink this instruction.

 any comments?

Should be checking if x may throw internally instead.

Richard, thanks for the suggestion. I have used insn_could_throw_p to do the
check, which only applies when flag_exceptions and flag_non_call_exceptions are
true, so those instructions can still be sunk for normal C/C++ programs.

Jeff,

below is the fix for the pr49847.C regression on aarch64. I re-ran the full
test suite on aarch64-none-elf bare metal, no regression.

bootstrap ok on x86, no regression on check-gcc/g++.

ok for trunk?


(re-sent with changelog entry)

gcc/

2014-09-26  Jiong Wang

 * shrink-wrap.c (move_insn_for_shrink_wrap): Check "insn_could_throw_p"
 before sinking insn.

I think can_throw_internal, per Richi's recommendation, is better.

Note that can_throw_internal keys off the existence of the EH landing
pads for the particular insn.


If flag_exceptions is false (for example), then I would not expect those
landing pads to exist, and the insn would not be considered as
potentially throwing.


Can you test with can_throw_internal to verify its behaviour and resubmit?


jeff
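
The difference between the two checks can be sketched with a toy model. This is not GCC's implementation: the structures and toy_* functions are hypothetical stand-ins, and the flag-driven check reflects only my reading of how insn_could_throw_p is described in this thread. The landing-pad check follows Jeff's description of can_throw_internal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for compiler state; NOT GCC's real data structures.  */
struct toy_insn
{
  bool is_call;          /* models a call insn */
  bool pattern_may_trap; /* models may_trap_p on the insn's pattern */
  int eh_landing_pad;    /* 0 means no landing pad is recorded */
};

struct toy_flags
{
  bool exceptions;          /* models -fexceptions */
  bool non_call_exceptions; /* models -fnon-call-exceptions */
};

/* Flag-driven check: could this insn throw at all?  */
bool
toy_insn_could_throw (const struct toy_flags *flags,
                      const struct toy_insn *insn)
{
  assert (flags != NULL && insn != NULL);
  if (!flags->exceptions)
    return false;
  if (insn->is_call)
    return true;
  return flags->non_call_exceptions && insn->pattern_may_trap;
}

/* Landing-pad-driven check: would a throw be caught in this function?  */
bool
toy_can_throw_internal (const struct toy_insn *insn)
{
  assert (insn != NULL);
  return insn->eh_landing_pad != 0;
}

/* Sinking is refused only when a handler inside the current function
   depends on the insn staying in its block.  */
bool
toy_may_sink (const struct toy_insn *insn)
{
  return !toy_can_throw_internal (insn);
}
```

In this model a trapping compare outside any try block could throw according to the flags, yet it has no landing pad and may still be sunk, while the same compare inside a try block may not.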





Re: [PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-26 Thread Jeff Law

On 09/26/14 07:23, Thomas Schwinge wrote:

Hi!

On Fri, 26 Sep 2014 12:04:45 +0400, Maxim Ostapenko 
 wrote:

Thank you all for your help!

Done in r215633.

-Maxim
On 09/25/2014 11:05 PM, Jeff Law wrote:

On 09/23/14 01:14, Maxim Ostapenko wrote:



2014-09-04  Jakub Jelinek
 Max Ostapenko

 * common.opt: New option.
 * doc/invoke.texi: Describe new option.
 * gcc.c (execute): Don't free first string early, but at the end
 of the function.  Call retry_ice if compiler exited with
 ICE_EXIT_CODE.
 (main): Factor out common code.
 (print_configuration): New function.
 (files_equal_p): Likewise.
 (check_repro): Likewise.
 (run_attempt): Likewise.
 (do_report_bug): Likewise.
 (append_text): Likewise.
 (try_generate_repro): Likewise

Approved.  Please install.

Thanks for your patience,
Jeff


This is causing compiler warnings, respectively bootstrap errors:

 [...]
 ../../master/gcc/gcc.c: In function 'attempt_status run_attempt(const char**, const char*, const char*, int, int)':
 ../../master/gcc/gcc.c:6319:15: error: variable 'errmsg' set but not used [-Werror=unused-but-set-variable]
const char *errmsg;
^
 ../../master/gcc/gcc.c: At global scope:
 ../../master/gcc/gcc.c:6412:33: error: unused parameter 'prog' [-Werror=unused-parameter]
  try_generate_repro (const char *prog, const char **argv)
  ^
 cc1plus: all warnings being treated as errors
 Makefile:1040: recipe for target 'gcc.o' failed
 make[3]: *** [gcc.o] Error 1
 make[3]: Leaving directory '/media/erich/home/thomas/tmp/gcc/hurd/master.build/gcc'
 Makefile:4285: recipe for target 'all-stage2-gcc' failed
 make[2]: *** [all-stage2-gcc] Error 2
 make[2]: Leaving directory '/media/erich/home/thomas/tmp/gcc/hurd/master.build'
 Makefile:21561: recipe for target 'stage2-bubble' failed
 make[1]: *** [stage2-bubble] Error 2
 make[1]: Leaving directory '/media/erich/home/thomas/tmp/gcc/hurd/master.build'
 Makefile:892: recipe for target 'all' failed
 make: *** [all] Error 2

OK to fix as follows?  Only compile-tested, did not test the new
-freport-bug functionality.

[ ... ]
Please construct a ChangeLog and commit.  Thanks.

jeff



Re: [PATCH] Avoid an unused stack frame for -mprofile-kernel profiling on leaf functions.

2014-09-26 Thread David Edelsohn
On Fri, Sep 26, 2014 at 9:02 AM, Anton Blanchard  wrote:
>
> gcc/:
>
> 2014-09-25  Anton Blanchard  
>
> PR target/63354
> * config/rs6000/rs6000.c (rs6000_keep_leaf_when_profiled): New 
> function.
> * config/rs6000/linux64.h (TARGET_KEEP_LEAF_WHEN_PROFILED): Define.

Okay. LGTM.

Thanks!

David


Re: ptx preliminary address space fixes [1/4]

2014-09-26 Thread Richard Biener
On September 26, 2014 5:14:24 PM CEST, Bernd Schmidt  
wrote:
>On 09/26/2014 02:42 PM, Richard Biener wrote:
>> On Fri, Sep 26, 2014 at 2:28 PM, Bernd Schmidt
> wrote:
>>> On 09/26/2014 02:26 PM, Richard Biener wrote:

 On Fri, Sep 26, 2014 at 2:14 PM, Bernd Schmidt
>
 wrote:
>
> On 09/26/2014 02:05 PM, Richard Biener wrote:
>>
>>
>> If currently address-space support matches up with the C frontend
>> and the C standard then the middle-end has to cope with that.
>> In this case, cope with array element types not having
>address-space
>> qualifiers.
>
>
>
> That's the opposite of what happens. The C frontend makes array
>element
> types have address-space qualifiers but not the array type.


 Ah, ok.  Then the opposite way around ;)
>>>
>>>
>>> Ok, so that means that my original patch which updated the element
>types for
>>> arrays is in fact the way to go?
>>
>> It seems to do both, apply the as to the array _and_ the element
>type, no?
>
>Yes. I guess I could not do this, but then the patch will also have to 
>replace all but very few uses of TYPE_ADDR_SPACE outside the C frontend
>
>with a new addr_space_for_type function that checks for arrays.

You have the reference_addr_space function for that.

Richard.

>I can do that, but to me it feels like utterly the wrong way to go. If 
>you're sure that's what you want, I'll make a patch.
>
>
>Bernd




Re: [PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-26 Thread Thomas Schwinge
Hi!

On Fri, 26 Sep 2014 10:18:32 -0600, Jeff Law  wrote:
> On 09/26/14 07:23, Thomas Schwinge wrote:
> > On Fri, 26 Sep 2014 12:04:45 +0400, Maxim Ostapenko 
> >  wrote:
> >> Done in r215633.

> > This is causing compiler warnings, respectively bootstrap errors: [...]

> > OK to fix as follows?  Only compile-tested, did not test the new
> > -freport-bug functionality.
> [ ... ]
> Please construct a ChangeLog and commit.  Thanks.

After Maxim had sent his email about having successfully tested it, I had
already taken the opportunity to commit it: r215644.


Grüße,
 Thomas




Re: [PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-26 Thread Jeff Law

On 09/26/14 10:31, Thomas Schwinge wrote:

Hi!

On Fri, 26 Sep 2014 10:18:32 -0600, Jeff Law  wrote:

On 09/26/14 07:23, Thomas Schwinge wrote:

On Fri, 26 Sep 2014 12:04:45 +0400, Maxim Ostapenko 
 wrote:

Done in r215633.



This is causing compiler warnings, respectively bootstrap errors: [...]



OK to fix as follows?  Only compile-tested, did not test the new
-freport-bug functionality.

[ ... ]
Please construct a ChangeLog and commit.  Thanks.


After Maxim had sent his email about having successfully tested it, I had
already taken the opportunity to commit it: r215644.

Thanks.
jeff



Re: [PATCH] Put all MAINTAINERS email addresses into <...>

2014-09-26 Thread Jeff Law

On 09/25/14 15:10, Jan-Benedict Glaw wrote:

Hi!

Resending this email. Seems some spam filter ate it due to the many
email addresses...


Following up on my suggestion to put all email addresses into <...>
(cf. https://gcc.gnu.org/ml/gcc/2014-09/msg00298.html) here's an
actual patch. Quite a mechanical change, along with a few clean-ups of
space-before-tab and trailing whitespace:


2014-09-23  Jan-Benedict Glaw  

* MAINTAINERS: Put all email addresses between '<' and '>'.

Looks good to me.  Please install.
jeff



Re: [PATCH C++] - SD-6 Implementation Part 2 - __has_include macro and C++ language feature macros.

2014-09-26 Thread Jason Merrill

On 09/26/2014 10:20 AM, Ed Smith-Rowland wrote:

On 09/25/2014 01:40 PM, Jason Merrill wrote:

Don't we support attribute deprecated in C++11?


We support [[gnu::deprecated]] in C++11 but not [[deprecated]] until C++14.


Hmm, that seems unnecessary.  I'd allow it in C++11 as well, and *maybe* 
complain if -pedantic; 7.6/5 says "For an attribute-token not specified 
in this International Standard, the behavior is implementation defined" 
so allowing it is conforming.


Jason



Re: [wwwdocs] Update C++1y status page now that C++14 is finished.

2014-09-26 Thread Gerald Pfeifer
On Wednesday 2014-09-24 08:24, Mike Stump wrote:
>> C++14 is no longer the next standard, it's here, so update the project
>> page.
> Can we have a web doc person update the name of the page 
> (projects/cxx1y.html -> projects/cxx14.html) and add a redirect 
> as necessary?

I'll do that over the weekend.

Gerald


Re: [debug-early] fix fortran regressions

2014-09-26 Thread Aldy Hernandez

On 09/26/14 01:29, Richard Biener wrote:


So - please try dropping push_cfun as you set current_function_decl
anyway.


Excellent.  Thanks for the clean-up suggestion.

I am also including a small fix to squelch a use-before-def problem.

Committing to branch.  Whine if in violent opposition.
Aldy
commit 7d371b0f69b8ff74a6fd17773dd5fde80687a698
Author: Aldy Hernandez 
Date:   Fri Sep 26 08:24:38 2014 -0700

* dwarf2out.c (dwarf2out_early_global_decl): Do not set cfun.
(gen_variable_die): Set origin_die before we exit early.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 26997b8..339e547 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -19057,6 +19057,7 @@ gen_variable_die (tree decl, tree origin, dw_die_ref context_die)
   gcc_assert (old_die->die_parent == context_die);
   var_die = old_die;
   old_die = NULL;
+  origin_die = NULL;
   goto gen_variable_die_location;
 }
 
@@ -20879,40 +20880,26 @@ dwarf2out_early_global_decl (tree decl)
   bool save = symtab->global_info_ready;
   symtab->global_info_ready = true;
 
-  bool fndecl_was_null = false;
   /* We don't handle TYPE_DECLs.  If required, they'll be reached via
  other DECLs and they can point to template types or other things
  that dwarf2out can't handle when done via dwarf2out_decl.  */
   if (TREE_CODE (decl) != TYPE_DECL
   && TREE_CODE (decl) != PARM_DECL)
 {
+  tree save_fndecl = current_function_decl;
   if (TREE_CODE (decl) == FUNCTION_DECL)
{
- /* A missing cfun means the symbol is unused and was removed
-from the callgraph.  */
+ /* A missing cfun means the symbol is unused.  */
  if (!DECL_STRUCT_FUNCTION (decl))
goto early_decl_exit;
 
- if (current_function_decl)
-   push_cfun (DECL_STRUCT_FUNCTION (decl));
- else
-   {
- set_cfun (DECL_STRUCT_FUNCTION (decl));
- fndecl_was_null = true;
-   }
  current_function_decl = decl;
}
   dw_die_ref die = dwarf2out_decl (decl);
   if (die)
die->dumped_early = true;
   if (TREE_CODE (decl) == FUNCTION_DECL)
-   {
- if (fndecl_was_null)
-   set_cfun (NULL);
- else
-   pop_cfun ();
- current_function_decl = NULL;
-   }
+   current_function_decl = save_fndecl;
 }
  early_decl_exit:
   symtab->global_info_ready = save;
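
The save/restore of current_function_decl in the patch above is an instance of a common pattern: remember the old value of a global, set it for the duration of some work, and put the old value back afterwards instead of resetting it to NULL. A rough sketch of the same idea as a C++ scope guard follows. The names Decl, current_fn and ScopedCurrentFn are stand-ins invented for this sketch; none of this is GCC code.

```cpp
#include <cassert>

struct Decl { const char *name; };   // hypothetical stand-in for a decl node
static Decl *current_fn = nullptr;   // stand-in for current_function_decl

// Saves the previous value on construction and restores it on destruction.
class ScopedCurrentFn {
public:
  explicit ScopedCurrentFn(Decl *d) : saved_(current_fn) { current_fn = d; }
  ~ScopedCurrentFn() { current_fn = saved_; }
  ScopedCurrentFn(const ScopedCurrentFn &) = delete;
  ScopedCurrentFn &operator=(const ScopedCurrentFn &) = delete;

private:
  Decl *saved_;
};

// Returns the value of the global as seen while the guard is active.
Decl *process(Decl *d) {
  ScopedCurrentFn guard(d);
  return current_fn;
}
```

The point of restoring the saved value is that a caller that already had a current function set does not lose it.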


[debug-early] reuse old DIE if it was dumped early

2014-09-26 Thread Aldy Hernandez
I'm not sure, but somewhere along the last few commits I caused a 
regression that inhibits libgfortran from building.  Serves me right for 
forgetting to rebuild all the target libraries after each patch.


For some instances of subprograms, we do not reuse the preexisting DIE,
and we trigger the check I had added here:


  /* If we early created a DIE, make sure it didn't get re-created by
 mistake.  */
  if (early_die && early_die->dumped_early)
gcc_assert (early_die == die);

Since we already have the dumped_early bit set for subprograms, we can 
just check that and proceed to reuse the previously generated die.


Fixed.  Cleaned some other trivia.  Committed to branch.

No guality regressions for any languages.  All target libraries built.

Aldy
commit 10d7286a26e8a0e29ecbbd0b362ecb8fcc8bfc62
Author: Aldy Hernandez 
Date:   Fri Sep 26 11:11:27 2014 -0700

* dwarf2out.c (gen_subprogram_die): Use old die if it was dumped
early.
Rearrange some comments.
(gen_variable_die): Move initialization of origin_die closer to
use.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 339e547..41c4feb 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -18308,14 +18308,14 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
  && !get_AT (old_die, DW_AT_inline))
{
  /* Detect and ignore this case, where we are trying to output
-something we have already output.
-
-If we have no location information, this must be a
-partially generated DIE from early dwarf generation.
-Fall through and generate it.  */
+something we have already output.  */
  if (get_AT (old_die, DW_AT_low_pc)
  || get_AT (old_die, DW_AT_ranges))
  return;
+
+ /* If we have no location information, this must be a
+partially generated DIE from early dwarf generation.
+Fall through and generate it.  */
}
 
   /* If the definition comes from the same place as the declaration,
@@ -18325,11 +18325,12 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 instances of inlines, since the spec requires the out-of-line copy
 to have the same parent.  For local class methods, this doesn't
 apply; we just use the old DIE.  */
-  if ((is_cu_die (old_die->die_parent) || context_die == NULL)
- && (DECL_ARTIFICIAL (decl)
- || (get_AT_file (old_die, DW_AT_decl_file) == file_index
- && (get_AT_unsigned (old_die, DW_AT_decl_line)
- == (unsigned) s.line
+  if (old_die->dumped_early
+ || ((is_cu_die (old_die->die_parent) || context_die == NULL)
+ && (DECL_ARTIFICIAL (decl)
+ || (get_AT_file (old_die, DW_AT_decl_file) == file_index
+ && (get_AT_unsigned (old_die, DW_AT_decl_line)
+ == (unsigned) s.line)
{
  subr_die = old_die;
 
@@ -18926,7 +18927,6 @@ gen_variable_die (tree decl, tree origin, dw_die_ref context_die)
   tree ultimate_origin;
   dw_die_ref var_die;
   dw_die_ref old_die = decl ? lookup_decl_die (decl) : NULL;
-  dw_die_ref origin_die;
   bool declaration = (DECL_EXTERNAL (decl_or_origin)
  || class_or_namespace_scope_p (context_die));
   bool specialization_p = false;
@@ -19050,6 +19050,8 @@ gen_variable_die (tree decl, tree origin, dw_die_ref context_die)
   if (old_die && !declaration && !local_scope_p (context_die))
 return;
 
+  dw_die_ref origin_die = NULL;
+
   /* If a DIE was dumped early, it still needs location info.  Skip to
  the part where we fill the location bits.  */
   if (old_die && old_die->dumped_early)
@@ -19057,7 +19059,6 @@ gen_variable_die (tree decl, tree origin, dw_die_ref context_die)
   gcc_assert (old_die->die_parent == context_die);
   var_die = old_die;
   old_die = NULL;
-  origin_die = NULL;
   goto gen_variable_die_location;
 }
 
@@ -19069,7 +19070,6 @@ gen_variable_die (tree decl, tree origin, dw_die_ref context_die)
   else
 var_die = new_die (DW_TAG_variable, context_die, decl);
 
-  origin_die = NULL;
   if (origin != NULL)
 origin_die = add_abstract_origin_attribute (var_die, origin);
 


Re: [jit] Add a test of using very long names

2014-09-26 Thread Mike Stump
On Sep 26, 2014, at 8:14 AM, David Malcolm  wrote:
>   * jit.dg/test-long-names.c: New test case.

> +/* 65KB */
> +#define NAME_LENGTH (65 * 1024)

65K was a tiny name back in 1999, 16M was a large name then.  Today, 16M is 
tiny enough.  And yeah, this was a customer bug report, just normal C++ code 
with template manglings back then and yeah, we fixed the bug and tested it out 
to 16M to ensure we would not hit another bug in the next decade.  As far as I 
know, we didn’t.  If you want to ensure it works nicely for the next decade 
test out to, say, 128M and then throw that test case away.  I’d be curious if 
you hit any problems at 128M.

Re: FW: [PATCH] Fix PR preprocessor/58893 access to uninitialized memory

2014-09-26 Thread Jeff Law

On 09/26/14 06:21, Bernd Edlinger wrote:

>
>Hi,
>
>this patch fixes PR58893, which is an access to uninitialized memory, which may or may not crash in
>linemap_resolve_location, or just print error messages with bogus location.
>
>When the first -include file is processed we have the case, where
>pfile->cur_token == pfile->cur_run->base, this is directly called
>by the front end. However in the case of the second -include file,
>this is called from _cpp_lex_token -> _cpp_get_fresh_line ->
>cpp_push_include, with pfile->cur_token != pfile->cur_run->base,
>and pfile->cur_token[-1].src_loc and token not (yet) initialized.
>The problem is, when the include file cannot be found, we need
>src_loc to be initialized to some safe value: 0 means UNKNOWN_LOCATION.
>
>Regarding the hunk in cpp_diagnostic, which is not directly involved
>in this bug, but it is still obviously wrong:
>
>The line "src_loc = pfile->cur_run->prev->limit->src_loc"
>is probably unreachable, but will crash if it is ever executed.
>
>see:
>
>_cpp_init_tokenrun (tokenrun *run, unsigned int count)
>{
>run->base = XNEWVEC (cpp_token, count);
>run->limit = run->base + count;
>run->next = NULL;
>}
>
>so, limit points at the end of the run.
>
>
>Boot-Strapped and Regression-tested on x86_64-linux-gnu
>Ok for trunk?
>
>
>Thanks
>Bernd.
>




changelog-pr58893.txt


2014-09-26  Bernd Edlinger

PR preprocessor/58893
* errors.c (cpp_diagnostic): Fix possible out of bounds access.
* files.c (_cpp_stack_include): Initialize src_loc for IT_CMDLINE.


patch-pr58893.diff


--- libcpp/errors.c 2014-01-02 23:24:45.0 +0100
+++ libcpp/errors.c 2014-09-24 10:30:33.708048505 +0200
@@ -48,10 +48,7 @@ cpp_diagnostic (cpp_reader * pfile, int
   current run -- that is invalid.  */
else if (pfile->cur_token == pfile->cur_run->base)
  {
-  if (pfile->cur_run->prev != NULL)
-   src_loc = pfile->cur_run->prev->limit->src_loc;
-  else
-   src_loc = 0;
+  src_loc = 0;
  }
else
  {
--- libcpp/files.c  2014-05-21 20:54:12.0 +0200
+++ libcpp/files.c  2014-09-24 10:35:47.191117490 +0200
@@ -991,6 +991,9 @@ _cpp_stack_include (cpp_reader *pfile, c
_cpp_file *file;
bool stacked;

+  if (type == IT_CMDLINE && pfile->cur_token != pfile->cur_run->base)
+pfile->cur_token[-1].src_loc = 0;
Comment before this change.  Someone not familiar with this code is 
going to have no idea why these two lines exist.


Please try to include a testcase.  If you're having trouble reproducing 
on the trunk, you could use MALLOC_PERTURB per c#8 in the bug report. 
If there's a way to set environment variables in our testing framework 
that may be a reasonable way to test (if you need to do that, limit 
testing to linux targets as we'll have a dependency on glibc features).


jeff
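
As background for the cpp_diagnostic hunk discussed above: since limit is set to base + count, it points one past the last token, so reading limit->src_loc is out of bounds. A small sketch of that layout follows. Token, TokenRun and last_token are simplified stand-ins written for this illustration, not the libcpp types or a proposed fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Token { unsigned src_loc; };   // hypothetical stand-in for cpp_token

struct TokenRun {                     // hypothetical stand-in for tokenrun
  std::vector<Token> storage;
  Token *base;
  Token *limit;                       // one past the last element, never dereferenced

  explicit TokenRun(std::size_t count)
      : storage(count), base(storage.data()), limit(base + count) {}
  TokenRun(const TokenRun &) = delete;
  TokenRun &operator=(const TokenRun &) = delete;
};

// Last valid token of a run, or nullptr when the run is empty.
Token *last_token(TokenRun &run) {
  return run.limit == run.base ? nullptr : run.limit - 1;
}
```

The last valid slot is limit - 1, and even that slot only holds meaningful data once the lexer has written to it.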



Enable TBAA on anonymous types with LTO

2014-09-26 Thread Jan Hubicka
Hello,
this is a patch to preserve TBAA for anonymous types to LTO.  The difference
can be seen on the testcase:

namespace
{
  struct A {int a;};
  struct B {int b;};
}

struct A aa,*a=&aa;
struct B bb,*b=&bb;

void
setA()
{
  a->a=1;
}
void
setB()
{
  b->b=2;
}
int
main()
{
  asm("":"=r"(a),"=r"(b));
  setA();
  setB();
  if (!__builtin_constant_p (a->a))
__builtin_abort ();
  return 0;
}

With patch it does get properly optimized with -O2 -fno-early-inlining.

The basic idea is to:
  1) stream canonical types when they are anonymous
 (and thus need not be structurally merged)
  2) update canonical type hash so it can deal with types that already have
 canonical type set.
 I insert even anonymous types there because I am not able to get rid
 of cases where non-anonymous type explicitly mentions anonymous. Consider:
  namespace {
  struct B {};
  }
  struct A
  {
void t(B);
void t2();
  };
  void
  A::t(B)
  {
  }
  void
  A::t2()
  {
  }
 Here we end up having type of method T non-anonymous but it builds from B
 that is anonymous.

 Being able to handle non-upwards closed cases will be needed soon for full
 ODR type handling
  3) Disable tree merging of anonymous namespace nodes and anonymous types.
 The second is needed, because I can have two identically looking anonymous
 types from same unit with different canonical types.

 This may go away once we get some ability to decide on unmergability at
 stream out time.

I do not attempt to merge anonymous types with structurally equivalent
non-anonymous types from other compilation units.  I think it is nature of C++
language that types in anonymous namespaces can not be accessed by other units
and I hope to use this for other optimizations, too.

We can add documentation about this to -fstrict-aliasing section of manual I guess.

What I am concerned about is the needed change in c-decl.c.  C frontend currently
outputs declarations that are confused by type_in_anonymous_namespace_p as anonymous
in some cases.  This is because it does not set PUBLIC flag on TYPE decl.  This is
bug:
/* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
   nonzero means name is to be accessible from outside this translation unit.
   In an IDENTIFIER_NODE, nonzero means an external declaration
   accessible from outside this translation unit was previously seen
   for this name in an inner scope.  */
#define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)

This fortunately manifests itself as false warnings about type incompatibility
from lto-symtab.  I did not see these with other languages, but I suppose we will
want to check that other FEs are behaving correctly here.
I do not know how Ada and Fortran should behave here.

Bootstrapped/regtested x86_64-linux, lto-bootstrapped, tested with Firefox and
libreoffice. I also checked that tree merging is working still well. OK?

Honza

* c-decl.c (pushtag): Set TREE_PUBLIC on STUB DECL.

* lto-streamer-out.c (DFS::DFS_write_tree_body): Optionally stream
TYPE_CANONICAL.

* lto.c (iterative_hash_canonical_type): Handle cases where TYPE_CANONICAL
is pre-set.
(gimple_register_canonical_type_1): Likewise.
(lto_register_canonical_types): Likewise.
(compare_tree_sccs_1): Anonymous namespaces never compare;
neither does types in anonymous namespace.
(lto_read_decls): Do not check TYPE_CANONICAL.
* tree-streamer-out.c (write_ts_type_common_tree_pointers): Optionally write
TYPE_CANONICAL.
* lto-streamer-in.c (lto_read_body_or_constructor): Handle case
where TYPE_CANONICAL is pre-set.
* tree-streamer-in.c (lto_input_ts_type_common_tree_pointers): Stream
in TYPE_CANONICAL.

Index: c/c-decl.c
===
--- c/c-decl.c  (revision 215645)
+++ c/c-decl.c  (working copy)
@@ -1466,6 +1466,7 @@ pushtag (location_t loc, tree name, tree
   /* An approximation for now, so we can tell this is a function-scope tag.
  This will be updated in pop_scope.  */
   TYPE_CONTEXT (type) = DECL_CONTEXT (TYPE_STUB_DECL (type));
+  TREE_PUBLIC (TYPE_STUB_DECL (type)) = 1;
 
   if (warn_cxx_compat && name != NULL_TREE)
 {
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 215645)
+++ lto-streamer-out.c  (working copy)
@@ -600,8 +600,11 @@ DFS::DFS_write_tree_body (struct output_
 during fixup.  */
   DFS_follow_tree_edge (TYPE_MAIN_VARIANT (expr));
   DFS_follow_tree_edge (TYPE_CONTEXT (expr));
-  /* TYPE_CANONICAL is re-computed during type merging, so no need
-to follow it here.  */
+  /* For non-anonymous types, TYPE_CANONICAL is re-computed during type
+merging, so no need to follow it here.  */
+  if (TYPE_CANONICAL

Re: Avoid privatization of TLS variables

2014-09-26 Thread Jan Hubicka
> On Fri, Sep 26, 2014 at 04:17:14AM +0200, Jan Hubicka wrote:
> > I was building libreoffice with profile feedback and I run into a message
> > 
> > cannot load any more object with static TLS
> > 
> > that took me a while to track as I did not see where static TLS is coming
> > out.
> > Ian pointed out to me that static variables with TLS storage also consume
> > static TLS even if they are in dynamic model.  This is why I disabled
> > localization.  Is there better way to handle this?
> 
> Fix a glibc bug?  It has been a while since I looked into glibc in
> any depth regarding TLS (2011-03), but I believe the l_tls_modid test
> here
> if (! RTLD_SINGLE_THREAD_P && imap->l_tls_modid > DTV_SURPLUS)
>   _dl_signal_error (0, "dlopen", NULL, N_("\
> cannot load any more object with static TLS"));
> 
> is wrong.  The test is saying "if we have loaded a certain number of
> dynamic objects with TLS segments, refuse to dlopen any more
> containing TLS if we are multi-threaded".
> 
> What it should be saying is "if we have loaded a certain number of
> dynamic objects with TLS segments *after we went multi-threaded*,
> refuse to open any more".  In particular, any dynamic objects with TLS
> segments loaded at program startup should not be counted.  This is
> because DTV_SURPLUS *extra* slots are allocated above those needed at
> program startup.  At least, that's how I think it works.

Yeah, this also looks like a very good idea to do (and would solve several
practical issues with this limit that I saw while googling for it).

Still if someone dlopens a bazillion shared libraries built with profile
feedback and does so after going multithreaded, it should not hit the limit. So
I think we need a GCC-side solution too.

Honza
> 
> -- 
> Alan Modra
> Australia Development Lab, IBM
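
To make the difference between the two checks in Alan's description concrete, here is a toy model. It is only a sketch: kSurplus is an illustrative stand-in for DTV_SURPLUS, both functions are hypothetical, and "loaded at startup" is used as a simplification of "loaded before going multi-threaded". It is not glibc code.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kSurplus = 14;  // illustrative stand-in for DTV_SURPLUS

// Check as quoted above: compares the absolute TLS module id with the surplus.
bool refuses_current(bool multi_threaded, std::size_t modid) {
  return multi_threaded && modid > kSurplus;
}

// Check as proposed (simplified): modules present at startup are not counted,
// so only the extra slots allocated above them are used up.
bool refuses_proposed(bool multi_threaded, std::size_t modid,
                      std::size_t startup_modules) {
  return multi_threaded && modid > startup_modules + kSurplus;
}
```

With 10 TLS modules at startup, module id 20 is refused by the first check and accepted by the second in this model.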


[PATCH, rs6000, committed] Fix effective target in gcc.target/powerpc/pr63335.c

2014-09-26 Thread Bill Schmidt
Hi,

I goofed on the effective target in the subject test, checking only if
it was ok to produce VSX instructions, not whether we were running on
hardware with VSX support.  The latter is needed.  This triggered a
failure on the VSX-less regression tester.  Fix committed as obvious.

Thanks,
Bill


2014-09-26  Bill Schmidt  

* gcc.target/powerpc/pr63335.c: Change effective target to
vsx_hw.


Index: gcc/testsuite/gcc.target/powerpc/pr63335.c
===
--- gcc/testsuite/gcc.target/powerpc/pr63335.c  (revision 215645)
+++ gcc/testsuite/gcc.target/powerpc/pr63335.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run { target { powerpc64*-*-* } } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target vsx_hw } */
 /* { dg-options "-mvsx" } */
 
 #include 




Re: [jit] Add a test of using very long names

2014-09-26 Thread David Malcolm
On Fri, 2014-09-26 at 11:45 -0700, Mike Stump wrote:
> On Sep 26, 2014, at 8:14 AM, David Malcolm 
> wrote:
> > * jit.dg/test-long-names.c: New test case.
> 
> > +/* 65KB */
> > +#define NAME_LENGTH (65 * 1024)
> 
> 65K was a tiny name back in 1999, 16M was a large name then.  Today,
> 16M is tiny enough.  And yeah, this was a customer bug report, just
> normal C++ code with template manglings back then and yeah, we fixed
> the bug and tested it out to 16M to ensure we would not hit another
> bug in the next decade.  As far as I know, we didn’t.  If you want to
> ensure it works nicely for the next decade test out to, say, 128M and
> then throw that test case away.  I’d be curious if you hit any
> problems at 128M.

Out of curiosity I tried upping NAME_LENGTH to 129M.

The compiler handled it fine, but FWIW "as" seems to be stuck here:

(gdb) bt
#0  0x00411730 in input_scrub_next_buffer (bufp=bufp@entry=0x693340) at input-scrub.c:390
#1  0x0041efab in read_a_source_file (name=) at read.c:768
#2  0x00404188 in perform_an_assembly_pass (argv=0x88bee8, argc=) at as.c:1095
#3  main (argc=2, argv=0x88bee0) at as.c:1242

whilst reading a 952M .s file.

(binutils-2.23.88.0.1-13.fc20.x86_64, fwiw)
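
For anyone wanting to repeat the experiment at other sizes, a minimal sketch of building an identifier of a given length follows. make_long_name is a hypothetical helper written for this note; the actual test case builds its names in C and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr std::size_t kNameLength = 65 * 1024;  // same value as NAME_LENGTH in the test

// Builds an identifier of exactly `length` characters: the prefix (truncated
// if needed) followed by underscore padding.
std::string make_long_name(const std::string &prefix, std::size_t length) {
  std::string name = prefix.substr(0, length);
  name.append(length - name.size(), '_');
  return name;
}
```

Raising the length to the 128M range only changes the constant, though memory use grows accordingly.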




C++ PATCH for abi_tag on builtin mangling abbreviations

2014-09-26 Thread Jason Merrill
We were ignoring abi_tags on the standard mangling abbreviations, such 
as Ss = std::string.  This patch fixes that so that we append tags to 
the abbreviation, and add the result as a substitution.
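
As a rough illustration of the resulting encoding, assuming each tag is written as 'B' followed by its length and name (which matches the "_Z1fSsB3fooS_" string expected by the new test), the abbreviation plus tags can be sketched as below. mangle_abbreviation is a hypothetical helper, not the mangle.c code, and it ignores tag ordering and the substitution table.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Appends each ABI tag to a standard abbreviation as B<length><name>.
std::string mangle_abbreviation(const std::string &abbr,
                                const std::vector<std::string> &tags) {
  std::string out = abbr;
  for (const std::string &tag : tags) {
    out += 'B';
    out += std::to_string(tag.size());
    out += tag;
  }
  return out;
}
```

In the test's expected symbol, the second std::string parameter is then encoded as S_, a back-reference to the substitution added for the tagged abbreviation.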


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 9e8c617e9135989151ba0a8d6f0d38c7913b2f23
Author: Jason Merrill 
Date:   Mon Aug 11 10:13:37 2014 -0400

gcc/cp/
	* mangle.c (is_std_substitution): Check for abi_tag.

libiberty/
	* cp-demangle.c (d_substitution): Handle abi tags on abbreviation.

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 9703d1c..4f94c19 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -512,6 +512,7 @@ find_substitution (tree node)
   const int size = vec_safe_length (G.substitutions);
   tree decl;
   tree type;
+  const char *abbr = NULL;
 
   if (DEBUG_MANGLE)
 fprintf (stderr, "  ++ find_substitution (%s at %p)\n",
@@ -530,13 +531,10 @@ find_substitution (tree node)
   if (decl
   && is_std_substitution (decl, SUBID_ALLOCATOR)
   && !CLASSTYPE_USE_TEMPLATE (TREE_TYPE (decl)))
-{
-  write_string ("Sa");
-  return 1;
-}
+abbr = "Sa";
 
   /* Check for std::basic_string.  */
-  if (decl && is_std_substitution (decl, SUBID_BASIC_STRING))
+  else if (decl && is_std_substitution (decl, SUBID_BASIC_STRING))
 {
   if (TYPE_P (node))
 	{
@@ -555,26 +553,20 @@ find_substitution (tree node)
 	   SUBID_CHAR_TRAITS)
 		  && is_std_substitution_char (TREE_VEC_ELT (args, 2),
 	   SUBID_ALLOCATOR))
-		{
-		  write_string ("Ss");
-		  return 1;
-		}
+		abbr = "Ss";
 	}
 	}
   else
 	/* Substitute for the template name only if this isn't a type.  */
-	{
-	  write_string ("Sb");
-	  return 1;
-	}
+	abbr = "Sb";
 }
 
   /* Check for basic_{i,o,io}stream.  */
-  if (TYPE_P (node)
-  && cp_type_quals (type) == TYPE_UNQUALIFIED
-  && CLASS_TYPE_P (type)
-  && CLASSTYPE_USE_TEMPLATE (type)
-  && CLASSTYPE_TEMPLATE_INFO (type) != NULL)
+  else if (TYPE_P (node)
+	   && cp_type_quals (type) == TYPE_UNQUALIFIED
+	   && CLASS_TYPE_P (type)
+	   && CLASSTYPE_USE_TEMPLATE (type)
+	   && CLASSTYPE_TEMPLATE_INFO (type) != NULL)
 {
   /* First, check for the template
 	 args  > .  */
@@ -587,35 +579,29 @@ find_substitution (tree node)
 	{
 	  /* Got them.  Is this basic_istream?  */
 	  if (is_std_substitution (decl, SUBID_BASIC_ISTREAM))
-	{
-	  write_string ("Si");
-	  return 1;
-	}
+	abbr = "Si";
 	  /* Or basic_ostream?  */
 	  else if (is_std_substitution (decl, SUBID_BASIC_OSTREAM))
-	{
-	  write_string ("So");
-	  return 1;
-	}
+	abbr = "So";
 	  /* Or basic_iostream?  */
 	  else if (is_std_substitution (decl, SUBID_BASIC_IOSTREAM))
-	{
-	  write_string ("Sd");
-	  return 1;
-	}
+	abbr = "Sd";
 	}
 }
 
   /* Check for namespace std.  */
-  if (decl && DECL_NAMESPACE_STD_P (decl))
+  else if (decl && DECL_NAMESPACE_STD_P (decl))
 {
   write_string ("St");
   return 1;
 }
 
+  tree tags = NULL_TREE;
+  if (OVERLOAD_TYPE_P (node))
+tags = lookup_attribute ("abi_tag", TYPE_ATTRIBUTES (type));
   /* Now check the list of available substitutions for this mangling
  operation.  */
-  for (i = 0; i < size; ++i)
+  if (!abbr || tags) for (i = 0; i < size; ++i)
 {
   tree candidate = (*G.substitutions)[i];
   /* NODE is a matched to a candidate if it's the same decl node or
@@ -630,8 +616,17 @@ find_substitution (tree node)
 	}
 }
 
-  /* No substitution found.  */
-  return 0;
+  if (!abbr)
+/* No substitution found.  */
+return 0;
+
+  write_string (abbr);
+  if (tags)
+{
+  write_abi_tags (tags);
+  add_substitution (node);
+}
+  return 1;
 }
 
 
diff --git a/gcc/testsuite/g++.dg/abi/abi-tag9.C b/gcc/testsuite/g++.dg/abi/abi-tag9.C
new file mode 100644
index 000..9ec78a9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/abi-tag9.C
@@ -0,0 +1,11 @@
+// { dg-final { scan-assembler "_Z1fSsB3fooS_" } }
+
+namespace std {
+  template  struct char_traits {};
+  template  struct allocator {};
+  template 
+  struct __attribute ((abi_tag ("foo"))) basic_string { };
+  typedef basic_string,allocator > string;
+}
+
+void f(std::string,std::string) {}
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 4ecdb1e..db3d679 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -3702,7 +3702,13 @@ d_substitution (struct d_info *di, int prefix)
 		  len = p->simple_len;
 		}
 	  di->expansion += len;
-	  return d_make_sub (di, s, len);
+	  struct demangle_component *p = d_make_sub (di, s, len);
+	  if (d_peek_char (di) == 'B')
+		{
+		  p = d_abi_tags (di, p);
+		  d_add_substitution (di, p);
+		}
+	  return p;
 	}
 	}
 
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index f8420ef..a030685 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@

Re: [Patch, MIPS] Add .note.GNU-stack section

2014-09-26 Thread Steve Ellcey
On Wed, 2014-09-10 at 10:15 -0700, Eric Christopher wrote:
> 
> 
> On Wed, Sep 10, 2014 at 9:27 AM,  wrote:

> This works except you did not update the assembly files in
> libgcc or glibc. We (Cavium) have the same patch in our tree
> for a few released versions.

> Mind just checking yours in then Andrew?

> Thanks!
> -eric

I talked to Andrew about what files he changed in GCC and created and
tested this new patch.  Andrew also mentioned changing some assembly
files in glibc but I don't see any use of '.section .note.GNU-stack' in
any assembly files in glibc (for any platform) so I wasn't planning on
creating a glibc patch to add them to mips glibc assembly language files.

OK to check in this patch?

Steve Ellcey
sell...@mips.com



2014-09-26  Steve Ellcey  

* config/mips/mips.c (TARGET_ASM_FILE_END): Define.
* libgcc/config/mips/mips16.S: Add .note.GNU-stack section.
* libgcc/config/mips/vr4120-div.S: Ditto.
* libgcc/config/mips/crti.S: Ditto.
* libgcc/config/mips/crtn.S: Ditto.


diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index f9713c1..39020d7 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -19146,6 +19146,9 @@ mips_lra_p (void)
 #undef TARGET_LRA_P
 #define TARGET_LRA_P mips_lra_p
 
+#undef TARGET_ASM_FILE_END
+#define TARGET_ASM_FILE_END file_end_indicate_exec_stack
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-mips.h"
diff --git a/libgcc/config/mips/crti.S b/libgcc/config/mips/crti.S
index 6980594..93436c0 100644
--- a/libgcc/config/mips/crti.S
+++ b/libgcc/config/mips/crti.S
@@ -21,6 +21,10 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */
 
+
+/* An executable stack is *not* required for these functions.  */
+   .section .note.GNU-stack,"",%progbits
+
 /* 4 slots for argument spill area.  1 for cpreturn, 1 for stack.
Return spill offset of 40 and 20.  Aligned to 16 bytes for n32.  */
 
diff --git a/libgcc/config/mips/crtn.S b/libgcc/config/mips/crtn.S
index 0de2d0c..6f2c301 100644
--- a/libgcc/config/mips/crtn.S
+++ b/libgcc/config/mips/crtn.S
@@ -21,6 +21,9 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */
 
+/* An executable stack is *not* required for these functions.  */
+.section .note.GNU-stack,"",%progbits
+
 /* 4 slots for argument spill area.  1 for cpreturn, 1 for stack.
Return spill offset of 40 and 20.  Aligned to 16 bytes for n32.  */
 
diff --git a/libgcc/config/mips/mips16.S b/libgcc/config/mips/mips16.S
index dde8939..58e4377 100644
--- a/libgcc/config/mips/mips16.S
+++ b/libgcc/config/mips/mips16.S
@@ -35,6 +35,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
values using the soft-float calling convention, but do the actual
operation using the hard floating point instructions.  */
 
+/* An executable stack is *not* required for these functions.  */
+.section .note.GNU-stack,"",%progbits
+   .previous
+
 #if defined _MIPS_SIM && (_MIPS_SIM == _ABIO32 || _MIPS_SIM == _ABIO64)
 
 /* This file contains 32-bit assembly code.  */
diff --git a/libgcc/config/mips/vr4120-div.S b/libgcc/config/mips/vr4120-div.S
index 76c4e7a..664d3c3 100644
--- a/libgcc/config/mips/vr4120-div.S
+++ b/libgcc/config/mips/vr4120-div.S
@@ -26,6 +26,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-mfix-vr4120.  div and ddiv do not give the correct result when one
of the operands is negative.  */
 
+/* An executable stack is *not* required for these functions.  */
+.section .note.GNU-stack,"",%progbits
+   .previous
+
.setnomips16
 
 #define DIV\




Re: [Patch] Switch elimination pass for PR 54742

2014-09-26 Thread Sebastian Pop
Jeff Law wrote:
> On 08/21/14 04:30, Richard Biener wrote:
> >>It turns Jeff's jump-threading code in to a strange franken-pass of bits and
> >>pieces of detection and optimisation, and would need some substantial
> >>reworking to fit in with Jeff's changes last Autumn, but if it is more
> >>likely to be acceptable for trunk then perhaps we could look to revive it.
> >>It would be nice to reuse the path copy code Jeff added last year, but I
> >>don't have much intuition as to how feasible that is.
> >>
> >>Was this the sort of thing that you were imagining?
> >
> >Yeah, didn't look too closely though.
> It'd be pretty ugly I suspect.  But it's probably worth pondering
> since that approach would eliminate the concerns about the cost of
> detection (which is problematical for the jump threader) by using
> Steve's code for that.
> 
> On the update side, I suspect most, if not all of the framework is
> in place to handle this kind of update if the right threading paths
> were passed to the updater.  I can probably cobble together that
> by-hand and see what the tree-ssa-threadupdate does with it.  But
> it'll be a week or so before I could look at it.

I adapted the patch James sent last year to use the new update paths
mechanism.  I verified that the attached patch does register all the paths that
need to be threaded.  Thread updater seems to have some problems handling the
attached testcase (a simplified version of the testcase attached to the bug.)

Jeff, could you please have a look at why the jump-thread updater is crashing?

Let me know if you want me to continue looking at the problem.

Thanks,
Sebastian
>From 1f09b819559865be5a366e11a9c0f9bf495f91bc Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Fri, 26 Sep 2014 14:54:20 -0500
Subject: [PATCH] jump thread for PR 54742

Adapted from a patch from James Greenhalgh.

	* Makefile.in: Add dependence on pointer-set.o.

	* tree-ssa-threadedge.c: Include pointer-set.h.
	(simplify_control_stmt_condition): Restore the original value of cond
	when simplification fails.
	(find_thread_path): New.
	(find_control_statement_thread_paths): New.
	(thread_through_normal_block): Call find_control_statement_thread_paths.

	* testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c: New.
---
 gcc/Makefile.in  |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c |  32 +
 gcc/tree-ssa-threadedge.c| 170 ++-
 3 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6f251a5..ebaed55 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1310,6 +1310,7 @@ OBJS = \
 	opts-global.o \
 	passes.o \
 	plugin.o \
+	pointer-set.o \
 	postreload-gcse.o \
 	postreload.o \
 	predict.o \
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
new file mode 100644
index 000..f3ef725
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -0,0 +1,32 @@
+int sum0, sum1, sum2, sum3;
+int foo(char * s, char** ret)
+{
+  int state=0;
+  char c;
+
+  for (; *s && state != 4; s++)
+{
+  c = *s;
+  if (c == '*')
+	{
+	  s++;
+	  break;
+	}
+  switch (state) {
+	case 0:
+	  if (c == '+') state = 1;
+	  else if (c != '-') sum0+=c;
+	  break;
+	case 1:
+	  if (c == '+') state = 2;
+	  else if (c == '-') state = 0;
+	  else sum1+=c;
+	  break;
+	default:
+	  break;
+  }
+
+}
+  *ret = s;
+  return state;
+}
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 3dee5ba..ee09841 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "tree-ssa-threadedge.h"
 #include "builtins.h"
+#include "pointer-set.h"
 
 /* To avoid code explosion due to jump threading, we limit the
number of statements we are going to copy.  This variable
@@ -628,6 +629,7 @@ simplify_control_stmt_condition (edge e,
  rather than use a relational operator.  These are simpler to handle.  */
   if (TREE_CODE (cond) == SSA_NAME)
 {
+  tree original_lhs = cond;
   cached_lhs = cond;
 
   /* Get the variable's current value from the equivalence chains.
@@ -656,6 +658,12 @@ simplify_control_stmt_condition (edge e,
 	 pass specific callback to try and simplify it further.  */
   if (cached_lhs && ! is_gimple_min_invariant (cached_lhs))
 cached_lhs = (*simplify) (stmt, stmt);
+
+  /* We couldn't find an invariant.  But, callers of this
+	 function may be able to do something useful with the
+	 unmodified destination.  */
+  if (!cached_lhs)
+	cached_lhs = original_lhs;
 }
   else
 cached_lhs = NULL;
@@ -915,6 +923,145 @@ thread_around_empty_blocks (edge taken_edge,
   return false;
 }
 
+/* Return true if there is at least one path from START_BB to END_BB.
+   VISITE

Re: [Patch, Fortran] Add CO_BROADCAST

2014-09-26 Thread Andreas Schwab
Tobias Burnus  writes:

> diff --git a/gcc/testsuite/gfortran.dg/coarray_collectives_9.f90 b/gcc/testsuite/gfortran.dg/coarray_collectives_9.f90
> new file mode 100644
> index 000..90c09c5
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/coarray_collectives_9.f90
> @@ -0,0 +1,62 @@
> +! { dg-do compile }
> +! { dg-options "-fcoarray=single" }
> +!
> +!
> +! CO_BROADCAST/CO_REDUCE
> +!
> +program test
> +  implicit none
> +  intrinsic co_broadcast
> +  intrinsic co_reduce
> +  integer :: val, i
> +  integer :: vec(3), idx(3)
> +  character(len=30) :: errmsg
> +  integer(8) :: i8
> +  character(len=19, kind=4) :: msg4
> +
> +  interface
> +pure function red_f(a, b)
> +  integer :: a, b, red_f
> +  intent(in) :: a, b
> +end function red_f
> +impure function red_f2(a, b)
> +  integer :: a, b, red_f
> +  intent(in) :: a, b
> +end function red_f2
> +  end interface
> +
> +  call co_broadcast("abc") ! { dg-error "Missing actual argument 'source_image' in call to 'co_broadcast'" }
> +  call co_reduce("abc") ! { dg-error "Missing actual argument 'operator' in call to 'co_reduce'" }
> +  call co_broadcast(1, source_image=1) ! { dg-error "'a' argument of 'co_broadcast' intrinsic at .1. must be a variable" }
> +  call co_reduce(a=1, operator=red_f) ! { dg-error "'a' argument of 'co_reduce' intrinsic at .1. must be a variable" }
> +  call co_reduce(a=val, operator=red_f2) ! { dg-error "OPERATOR argument at (1) must be a PURE function" }
> +
> +  call co_broadcast(val, source_image=[1,2]) ! { dg-error "must be a scalar" }
> +  call co_broadcast(val, source_image=1.0) ! { dg-error "must be INTEGER" }
> +  call co_broadcast(val, 1, stat=[1,2]) ! { dg-error "must be a scalar" }
> +  call co_broadcast(val, 1, stat=1.0) ! { dg-error "must be INTEGER" }
> +  call co_broadcast(val, 1, stat=1) ! { dg-error "must be a variable" }
> +  call co_broadcast(val, stat=i, source_image=1) ! OK
> +  call co_broadcast(val, stat=i, errmsg=errmsg, source_image=1) ! OK
> +  call co_broadcast(val, stat=i, errmsg=[errmsg], source_image=1) ! { dg-error "must be a scalar" }
> +  call co_broadcast(val, stat=i, errmsg=5, source_image=1) ! { dg-error "must be CHARACTER" }
> +  call co_broadcast(val, 1, errmsg="abc") ! { dg-error "must be a variable" }
> +  call co_broadcast(val, 1, stat=i8) ! { dg-error "The stat= argument at .1. must be a kind=4 integer variable" }
> +  call co_broadcast(val, 1, errmsg=msg4) ! { dg-error "The errmsg= argument at .1. must be a default-kind character variable" }
> +
> +  call co_reduce(val, red_f, result_image=[1,2]) ! { dg-error "must be a scalar" }
> +  call co_reduce(val, red_f, result_image=1.0) ! { dg-error "must be INTEGER" }
> +  call co_reduce(val, red_f, stat=[1,2]) ! { dg-error "must be a scalar" }
> +  call co_reduce(val, red_f, stat=1.0) ! { dg-error "must be INTEGER" }
> +  call co_reduce(val, red_f, stat=1) ! { dg-error "must be a variable" }
> +  call co_reduce(val, red_f, stat=i, result_image=1) ! OK
> +  call co_reduce(val, red_f, stat=i, errmsg=errmsg, result_image=1) ! OK
> +  call co_reduce(val, red_f, stat=i, errmsg=[errmsg], result_image=1) ! { dg-error "must be a scalar" }
> +  call co_reduce(val, red_f, stat=i, errmsg=5, result_image=1) ! { dg-error "must be CHARACTER" }
> +  call co_reduce(val, red_f, errmsg="abc") ! { dg-error "must be a variable" }
> +  call co_reduce(val, red_f, stat=i8) ! { dg-error "The stat= argument at .1. must be a kind=4 integer variable" }
> +  call co_reduce(val, red_f, errmsg=msg4) ! { dg-error "The errmsg= argument at .1. must be a default-kind character variable" }
> +
> +  call co_broadcasr(vec(idx), 1) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_sum shall not have a vector subscript" }
> +  call co_reduce(vec([1,3,2]), red_f) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_min shall not have a vector subscript" }
> +end program test

FAIL: gfortran.dg/coarray_collectives_9.f90   -O   (test for errors, line 32)
FAIL: gfortran.dg/coarray_collectives_9.f90   -O   (test for errors, line 57)
FAIL: gfortran.dg/coarray_collectives_9.f90   -O   (test for errors, line 58)
FAIL: gfortran.dg/coarra

[jit] Experimental in-process embedding of the gcc driver into the jit

2014-09-26 Thread David Malcolm
On Tue, 2014-09-23 at 23:27 +, Joseph S. Myers wrote:
[...]
> The code for compiling a .s file should:
[...]
> * use libiberty's pexecute to run subprocesses, not "system" (building up 
> a string to pass to the shell always looks like a security hole, though in 
> this case it may in fact be safe);
> 
> * use the $(target_noncanonical)-gcc-$(version) name for the driver rather 
> than plain "gcc", to maximise the chance that it is actually the same 
> compiler the JIT library was built for (I realise you may not actually 
> depend on it being the same compiler, but that does seem best; in 
> principle in future it should be possible to load multiple copies of the 
> JIT library to JIT for different targets, so that code for an offload 
> accelerator can go through the JIT).
[...]
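
The point quoted above about avoiding "system" can be illustrated with a minimal sketch. It uses plain POSIX fork/execvp/waitpid as a stand-in, not libiberty's pex API, and the helper name run_argv is hypothetical. The idea is only that an argument vector is handed to the child directly, so no shell ever re-parses a command string.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run ARGV[0] (searched for in PATH) with the given argument vector.
   Return the child's exit status, 127 if the exec itself failed, or -1
   if the child could not be created or was killed by a signal.  No
   shell is involved, so arguments are passed through verbatim.
   Hypothetical helper, not part of the patch under discussion.  */
static int
run_argv (char *const argv[])
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      execvp (argv[0], argv);
      _exit (127);		/* Only reached when exec failed.  */
    }

  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A caller would build an array such as { driver_name, "-shared", "-o", output, input, NULL } rather than concatenating those pieces into one string.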

The JIT generates assembler, but needs to generate a shared library.
Currently it invokes a "gcc" driver binary to go from .s to .so

I had the idea of turning the driver code in gcc.c into a library
and using it directly, in-process.

This experiment renames "main" in gcc.c to "driver_main" for use by
the insides of the JIT library, adding a gcc-main.c with a "main" that
simply calls "driver_main" (rather like how we have main.c calling
toplev_main for use by cc1 etc).

I can then call driver_main from inside the JIT library, and call a
new "driver_finalize" function to try to cleanup state in gcc.c enough
to support repeated calls.

I have to set LIBRARY_PATH so that the "ld" invocation can find -lgcc
and -lgcc_s:

  LD_LIBRARY_PATH=. \
  LIBRARY_PATH=../../install/lib/gcc/x86_64-unknown-linux-gnu/5.0.0:../../install/lib \
  gdb --args \
    testsuite/jit/test-factorial.exe

I also pass -fno-use-linker-plugin to driver_main to stop path issues
locating that.

This works for 5 or 6 in-process iterations, but eventually dies with:
  test-factorial.exe: error trying to exec 'ld': execvp: Argument list too long

LIBRARY_PATH in the process' environment gets crazily long; what I think
is happening is that gcc.c uses getenv("LIBRARY_PATH"), processes the result
somewhat, then uses putenv("LIBRARY_PATH"), leading to (I think) an
exponential explosion in the length of LIBRARY_PATH in the process's env
(and the eventual failure seen above).
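
To make the suspected failure mode concrete, here is a small sketch of a read-modify-write cycle on an environment variable. It is not the code in gcc.c; the function name and the "value is written back twice" behaviour are assumptions chosen only to show what geometric growth across repeated in-process runs would look like.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one in-process driver run: read VAR,
   "process" it by joining the old value with itself, and write the
   result back.  Returns the new length of VAR, or 0 on allocation
   failure.  */
static size_t
simulate_driver_run (const char *var)
{
  const char *old = getenv (var);
  if (old == NULL)
    old = "";

  size_t len = 2 * strlen (old) + 2;	/* old + ':' + old + NUL.  */
  char *buf = malloc (len);
  if (buf == NULL)
    return 0;

  snprintf (buf, len, "%s:%s", old, old);
  setenv (var, buf, 1);			/* setenv copies BUF.  */
  free (buf);
  return strlen (getenv (var));
}
```

With this assumed doubling, each run roughly doubles the length of the variable, so the environment would outgrow the exec argument limit after a fairly small number of iterations.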

Other than that... my simple testcase seems to work.

In crude perftesting it currently seems to be slightly *slower*; e.g.:

Using driver_main, buggily:
 assemble JIT code   :   0.00 ( 0%) usr   0.04 (80%) sys   0.18 (49%) wall       0 kB ( 0%) ggc
 TOTAL               :   0.16             0.05             0.37               2348 kB

Using a subprocess:
 assemble JIT code   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.15 (48%) wall       0 kB ( 0%) ggc
 TOTAL               :   0.14             0.02             0.31               2348 kB

Perhaps this is because of the env accumulation bug above, or maybe I'm
doing an apples-to-oranges comparison somewhere.

Not committing this; just posting it for discussion and archival.

gcc/ChangeLog.jit:
* Makefile.in (GCC_OBJS): Add gcc-main.o.
* gcc/gcc-main.c: New file, implementing just a "main".
* gcc.c (main): Rename to...
(driver_main): ...this.
(driver_finalize): New function.
* gcc.h (driver_main): New prototype.
(driver_finalize): Likewise.

gcc/jit/ChangeLog.jit:
* Make-lang.in (jit_OBJS): Add gcc.o so that we can use
driver_main, and jitspec.o to provide implementations of functions
needed by it.  Put .o files on individual lines, sorted
alphabetically.
(LIBGCCJIT_FILENAME): Add $(EXTRA_GCC_OBJS) $(EXTRA_GCC_LIBS) to
the linkage line (and deps), so that we can pick up the
config's implementation of "host_detect_local_cpu", which is
needed by gcc.o.
* internal-api.c: Include gcc.h.
(gcc::jit::playback::context::compile): Rewrite invocation of
assembler and linker to simply call into driver_main in-process,
rather than invoking "gcc".  Keep the old code around for now
for performance testing.
* jitspec.c: New file, providing implementations of functions
and variables needed by gcc.o: functions lang_specific_driver
and lang_specific_pre_link, and variable
lang_specific_extra_outfiles.
* notes.txt: Show the invocation of driver_main and
driver_finalize.
---
 gcc/Makefile.in|  2 +-
 gcc/gcc-main.c | 31 +++
 gcc/gcc.c  | 20 +---
 gcc/gcc.h  |  7 +++
 gcc/jit/Make-lang.in   | 11 +--
 gcc/jit/internal-api.c | 38 +-
 gcc/jit/jitspec.c  | 42 ++
 gcc/jit/notes.txt  | 11 +--
 8 files changed, 149 insertions(+), 13 deletions(-)
 create mode 100644 gcc/gcc-main.c
 create mode 100644 gcc/jit/jitspec.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f56

RFA: one more version of the patch for PR61360

2014-09-26 Thread Vladimir Makarov

I guess we achieved the consensus about the following patch to fix PR61360

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

The patch was successfully bootstrapped and tested (w/wo 
-march=amdfam10) on x86/x86-64.


Is it ok to commit to trunk?

2014-09-26  Vladimir Makarov  

PR target/61360
* lra.c (lra): Remove call of recog_init.
* recog.c (constrain_operands): Permit reg for memory constraint
when LRA is used.
* config/i386/i386.md (*float2_sse):
Enable first alternative independently on RA stage.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 215337)
+++ config/i386/i386.md (working copy)
@@ -4796,13 +4796,8 @@
&& X87_ENABLE_FLOAT (mode,
 mode)")
 (eq_attr "alternative" "1")
-  /* ??? For sched1 we need constrain_operands to be able to
- select an alternative.  Leave this enabled before RA.  */
   (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS
-   || optimize_function_for_size_p (cfun)
-   || !(reload_completed
-|| reload_in_progress
-|| lra_in_progress)")
+   || optimize_function_for_size_p (cfun)")
]
(symbol_ref "true")))
])
Index: lra.c
===
--- lra.c   (revision 215358)
+++ lra.c   (working copy)
@@ -2135,11 +2135,6 @@ lra (FILE *f)
 
   lra_in_progress = 1;
 
-  /* The enable attributes can change their values as LRA starts
- although it is a bad practice.  To prevent reuse of the outdated
- values, clear them.  */
-  recog_init ();
-
   lra_live_range_iter = lra_coalesce_iter = 0;
   lra_constraint_iter = lra_constraint_iter_after_spill = 0;
   lra_inheritance_iter = lra_undo_inheritance_iter = 0;
Index: recog.c
===
--- recog.c (revision 215337)
+++ recog.c (working copy)
@@ -2639,7 +2639,10 @@ constrain_operands (int strict)
   || (strict < 0 && CONSTANT_P (op))
   /* During reload, accept a pseudo  */
   || (reload_in_progress && REG_P (op)
-  && REGNO (op) >= FIRST_PSEUDO_REGISTER)))
+  && REGNO (op) >= FIRST_PSEUDO_REGISTER)
+  /* LRA can put reg value into memory if
+ it is necessary.  */
+  || (strict <= 0 && targetm.lra_p () && REG_P 
(op
win = 1;
  else if (insn_extra_address_constraint (cn)
   /* Every address operand can be reloaded to fit.  */


Re: [PATCH 3/5] IPA ICF pass

2014-09-26 Thread Jan Hubicka
Hi,
this is on ipa-icf-gimple.c

@@ -2827,11 +2829,19 @@ cgraph_node::verify_node (void)
{
  if (verify_edge_corresponds_to_fndecl (e, decl))
{
- error ("edge points to wrong declaration:");
- debug_tree (e->callee->decl);
- fprintf (stderr," Instead of:");
- debug_tree (decl);
- error_found = true;
+ /* The edge can be redirected in WPA by IPA ICF.
+Following check really ensures that it's
+not the case.  */
+
+ cgraph_node *current_node = cgraph_node::get (decl);
+ if (!current_node || !current_node->icf_merged)

I would move this into verify_edge_corresponds_to_fndecl.

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
new file mode 100644
index 000..7031eaa
--- /dev/null
+++ b/gcc/ipa-icf-gimple.c
@@ -0,0 +1,384 @@
+/* Interprocedural Identical Code Folding pass
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+   Contributed by Jan Hubicka  and Martin Liska 

+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */

Please add toplevel comment about what the code does and how to use it.

+namespace ipa_icf {
+
+/* Basic block equivalence comparison function that returns true if
+   basic blocks BB1 and BB2 (from functions FUNC1 and FUNC2) correspond.  */
... to each other?
I would add a short comment that, as the comparison goes, you build a vocabulary
of equivalences of variables/ssanames etc.,
so people reading the code do not get lost at the very beginning.
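
For readers of the archive, the "vocabulary of equivalences" idea might be sketched as follows. This is not the func_checker code; names are reduced to small integers, and equiv_map, equiv_init and equiv_record are hypothetical. The first time a pair is seen it is recorded, and every later occurrence has to agree with what was recorded, in both directions.

```c
#include <stdbool.h>

#define MAX_NAMES 64
#define UNMAPPED (-1)

/* Hypothetical one-to-one map between names of function 1 and names of
   function 2, built up while the two bodies are walked in lockstep.  */
struct equiv_map
{
  int fwd[MAX_NAMES];	/* Name in function 1 -> name in function 2.  */
  int rev[MAX_NAMES];	/* Name in function 2 -> name in function 1.  */
};

static void
equiv_init (struct equiv_map *m)
{
  for (int i = 0; i < MAX_NAMES; i++)
    m->fwd[i] = m->rev[i] = UNMAPPED;
}

/* Record that A corresponds to B, or verify it against an earlier
   record.  Return false when the pair contradicts the vocabulary.  */
static bool
equiv_record (struct equiv_map *m, int a, int b)
{
  if (a < 0 || a >= MAX_NAMES || b < 0 || b >= MAX_NAMES)
    return false;

  if (m->fwd[a] == UNMAPPED && m->rev[b] == UNMAPPED)
    {
      m->fwd[a] = b;
      m->rev[b] = a;
      return true;
    }
  return m->fwd[a] == b && m->rev[b] == a;
}
```

Keeping the reverse direction as well is what rejects two different names in one function both being matched to the same name in the other.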

+
+bool
+func_checker::compare_bb (sem_bb *bb1, sem_bb *bb2)
+{
+  unsigned i;
+  gimple_stmt_iterator gsi1, gsi2;
+  gimple s1, s2;
+
+  if (bb1->nondbg_stmt_count != bb2->nondbg_stmt_count
+  || bb1->edge_count != bb2->edge_count)
+return RETURN_FALSE ();

The UPPERCASE looks ugly.  I see that RETURN_FALSE is a wrapper for
return_false_with_msg
that outputs line and file information.

I would make it lowercase even if it is a macro. You may consider using
CXX_MEM_STAT_INFO style default argument to avoid the function macro completely.
Probably not a big win given that it won't save you from the preprocessor mess.
+
+  gsi1 = gsi_start_bb (bb1->bb);
+  gsi2 = gsi_start_bb (bb2->bb);
+
+  for (i = 0; i < bb1->nondbg_stmt_count; i++)
+{
+  if (is_gimple_debug (gsi_stmt (gsi1)))
+   gsi_next_nondebug (&gsi1);
+
+  if (is_gimple_debug (gsi_stmt (gsi2)))
+   gsi_next_nondebug (&gsi2);
+
+  s1 = gsi_stmt (gsi1);
+  s2 = gsi_stmt (gsi2);
+
+  if (gimple_code (s1) != gimple_code (s2))
+   return RETURN_FALSE_WITH_MSG ("gimple codes are different");

I think you need to compare EH here.  Consider case where one unit
is compiled with -fno-exception and thus all EH regions are removed,
while other function has EH regions in it.  Those are not equivalent.

The EH region is obtained by lookup_stmt_eh and then you need to compare
them for a match as you do with gimple_resx_region.

+  t1 = gimple_call_fndecl (s1);
+  t2 = gimple_call_fndecl (s2);
+
+  /* Function pointer variables are not supported yet.  */

They seem to be; compare_operand seems just right.

+
+/* Verifies for given GIMPLEs S1 and S2 that
+   label statements are semantically equivalent.  */
+
+bool
+func_checker::compare_gimple_label (gimple g1, gimple g2)
+{
+  if (m_ignore_labels)
+return true;
+
+  tree t1 = gimple_label_label (g1);
+  tree t2 = gimple_label_label (g2);
+
+  return compare_tree_ssa_label (t1, t2);
+}

I would expect the main BB loop to record the BB to which each label belongs
and the BB association being checked here.
Otherwise I do not see how switch statements are compared to not have
different permutations of targets. Also note that one BB may have
multiple labels in it and they are equivalent.
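
A rough sketch of that suggestion, with everything reduced to integer indices (the array layout and function names are assumptions, not the ipa-icf data structures): two labels are treated as equivalent when the blocks they live in have been matched, which also makes several labels inside one block interchangeable and catches permuted switch targets.

```c
#include <stdbool.h>

/* LABEL_BB1[l] is the index of the basic block holding label L of
   function 1, LABEL_BB2 likewise for function 2, and BB_MAP[i] is the
   block of function 2 that block I of function 1 was matched with.
   All three tables are hypothetical.  */
static bool
labels_match (const int *label_bb1, int l1,
	      const int *label_bb2, int l2, const int *bb_map)
{
  return bb_map[label_bb1[l1]] == label_bb2[l2];
}

/* Compare the N case targets of two switch statements position by
   position; a permutation of targets fails because the blocks behind
   the labels no longer correspond.  */
static bool
switch_targets_match (const int *targets1, const int *targets2, int n,
		      const int *label_bb1, const int *label_bb2,
		      const int *bb_map)
{
  for (int i = 0; i < n; i++)
    if (!labels_match (label_bb1, targets1[i], label_bb2, targets2[i],
		       bb_map))
      return false;
  return true;
}
```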

Also I would punt on occurrence of FORCED_LABEL. Those are tricky as they
may be passed around and compared for address and no one really defines
what should happen.  Better to avoid those.

+
+/* Verifies for given GIMPLEs S1 and S2 that
+   switch statements are semantically equivalent.  */
+
+bool
+func_checker::compare_gimple_switch (gimple g1, gimple g2)
+{
+

Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition

2014-09-26 Thread Jeff Law

On 09/26/14 07:57, Felix Yang wrote:

Hi Jeff,

 Thanks for the suggestions. I updated the patch accordingly.

 1. Both my employer(Huawei) and I have signed the copyright
assignments with FSF.
 These assignments are already sent via post two days ago and
hopefully should reach FSF in one week.
 Maybe it's OK to commit this patch now?
Not really.  It needs to be accepted by the FSF before we can include 
the work.




  2. I am not turning member loop_depth of struct equivalence into
short integer as GCC API such as bb_loop_depth
  returns a loop's depth as a 32-bit integer.
There are already other places that assume loops don't nest that deep. 
Please go ahead and change it.  And no need to explicitly mark the 
unused bits.  That's just a maintenance nightmare in the long term 
anyway :-)





  3. I find it's kind of difficult to use the new type and
interfaces for list walking the init_insns list for this patch.
 The type of init_insns list is rtx, not rtl_insn_list *. Seems
we need to change a lot in order to use the new interface.
 Not clear about the reason why it is not adjusted when we are
transferring to the new interface.
 Anyway, I think it's better to have another patch fix that issue. OK?
The right way to go is to add a checked cast when we have some code that 
is using the old interface and other code using the new interface.  It's 
actually a pretty easy change.


The checked casts effectively mark the limits of where we've been able 
to push the RTL typesafety work.  Long term as we push the typesafety 
work further into the compiler many/most of the checked casts will go away.


Unfortunately, that won't work in this case because other code wants to 
store a (const0_rtx) into the insn list.  (const0_rtx) isn't an INSN, so 
the checked cast fails and we get a nice abort/ICE.


Conceptually we just need another marker that is an INSN and we might as 
well just convert the whole file to use the new interface at that point.


Consider the request pulled.

The const0_rtx problem may be why this wasn't converted in the first 
place.  Or it may simply have been a time problem.  David's done > 250 
patches around RTL typesafety, but he also has other work to be doing ;-)




  4. This bug is only reproducible with my local customized GCC
version. So I don't have a testcase then.

OK.

I'll do a final review when I get notice about the copyright assignment 
from the FSF.


jeff



Re: [Patch, AArch64] Enable Address sanitizer and UB sanitizer

2014-09-26 Thread Andreas Schwab
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=34c65c4

* sanitizer_common/sanitizer_platform_limits_posix.h
(__sanitizer___kernel_old_uid_t, __sanitizer___kernel_old_gid_t)
[__aarch64__]: Define to unsigned short.

---
 libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index caa36a4..139fe0a 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -470,7 +470,7 @@ namespace __sanitizer {
   typedef long __sanitizer___kernel_off_t;
 #endif
 
-#if defined(__powerpc__) || defined(__aarch64__) || defined(__mips__)
+#if defined(__powerpc__) || defined(__mips__)
   typedef unsigned int __sanitizer___kernel_old_uid_t;
   typedef unsigned int __sanitizer___kernel_old_gid_t;
 #else
-- 
2.1.1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[PATCH] Fix finding default baseline symbols directory

2014-09-26 Thread Andreas Schwab
Tested on aarch64-suse-linux, where try_cpu=generic.

Andreas.

* configure.host: Use host_cpu, not try_cpu, to define default
abi_baseline_pair.
---
 libstdc++-v3/configure.host | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
index a12871a..abfd609 100644
--- a/libstdc++-v3/configure.host
+++ b/libstdc++-v3/configure.host
@@ -346,8 +346,8 @@ case "${host}" in
 abi_baseline_pair=x86_64-linux-gnu
 ;;
   *)
-if test -d ${glibcxx_srcdir}/config/abi/post/${try_cpu}-linux-gnu; then
-  abi_baseline_pair=${try_cpu}-linux-gnu
+if test -d ${glibcxx_srcdir}/config/abi/post/${host_cpu}-linux-gnu; then
+  abi_baseline_pair=${host_cpu}-linux-gnu
 fi
 esac
 case "${host}" in
-- 
2.1.1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [jit] Avoiding hardcoding "gcc"; supporting accelerators?

2014-09-26 Thread Joseph S. Myers
On Thu, 25 Sep 2014, David Malcolm wrote:

> Should this have the $(exeext) suffix seen in Makefile.in?
>   $(target_noncanonical)-gcc-$(version)$(exeext)

Depends on whether that's needed for the pex code to find it.

> As for (B), would it make sense to "bake in" the path to the binary into
> the pex invocation, and hence to turn off PEX_SEARCH?  If so, presumably
> I need to somehow expand the Makefile's value of $(bindir) into
> internal-api.c, right?  (I tried this in configure.ac, but merely got
> "$(exec_prefix)/bin" iirc).

An installation must be relocatable.  Thus, you can't just hardcode 
looking in the configured prefix; you'd need to locate it relative to 
libgccjit.so in some way (i.e. using make_relative_prefix, but I don't 
know offhand how libgccjit.so would locate itself).
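
As an illustration of "relative to libgccjit.so", the sketch below derives a sibling path from the library's own location by plain string handling. It does not use libiberty's make_relative_prefix, the function name is hypothetical, and how the library learns its own path is left to the caller, exactly as it is left open above.

```c
#include <stdio.h>
#include <string.h>

/* Write into OUT the path REL interpreted relative to the directory
   containing SELF_PATH.  Return 0 on success, or -1 if the result does
   not fit in OUT_SIZE bytes.  Hypothetical helper for illustration.  */
static int
path_relative_to_self (const char *self_path, const char *rel,
		       char *out, size_t out_size)
{
  const char *slash = strrchr (self_path, '/');
  /* Keep the trailing slash of the directory part, if there is one.  */
  size_t dir_len = slash ? (size_t) (slash - self_path) + 1 : 0;

  int n = snprintf (out, out_size, "%.*s%s", (int) dir_len, self_path, rel);
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

Because nothing here refers to the configured prefix, moving the whole installation tree keeps the computed path valid, which is the relocatability property being asked for.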

> A better long-term approach to this would be to extract the spec
> machinery from gcc.c (perhaps into a "libdriver.a"?) and run it directly
> from the jit library - but that's a rather involved patch, I suspect.

And you'd still need libgccjit.so to locate itself for proper 
relocatability in finding other pieces such as assembler and linker.

> I wonder if the appropriate approach here is to have a single library
> with multiple plugin backends e.g. one for the CPU, one for each GPU
> family, with the ability to load multiple "backends" at once.

If you can get that working, sure.

> Unfortunately, "backend" is horribly overloaded here - I mean basically
> all of gcc here, everything other than the libgccjit.h API seen by
> client code.

(Though preferably as much as possible could be shared, i.e. properly 
define the parts of GCC that need building separately for each target and 
limit them as much as possible.  Joern's multi-target patches from 2010 
that selectively built parts of GCC using namespaces while sharing others 
without an obvious clear separation seemed very fragile.  For something 
robust you either build everything separately for each target, or have a 
well-defined separation between bits needing building separately and bits 
that can be built once and ways to avoid non-obvious target dependencies 
in bits built once.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure

2014-09-26 Thread Joseph S. Myers
On Fri, 26 Sep 2014, Ilya Verbin wrote:

> 2014-09-26  Bernd Schmidt  
>   Thomas Schwinge  
>   Ilya Verbin  
>   Andrey Turetskiy  
> 
>   * configure: Regenerate.
>   * configure.ac (--enable-as-accelerator-for)
>   (--enable-offload-targets): New configure options.
> gcc/
>   * Makefile.in (real_target_noncanonical, accel_dir_suffix)
>   (enable_as_accelerator): New variables substituted by configure.
>   (libsubdir, libexecsubdir, unlibsubdir): Tweak for the possibility of
>   being configured as an offload compiler.
>   (DRIVER_DEFINES): Pass new defines DEFAULT_REAL_TARGET_MACHINE and
>   ACCEL_DIR_SUFFIX.
>   (install-cpp, install-common, install_driver, install-gcc-ar): Do not
>   install for the offload compiler.
>   * config.in: Regenerate.
>   * configure: Regenerate.
>   * configure.ac (real_target_noncanonical, accel_dir_suffix)
>   (enable_as_accelerator, enable_offload_targets): Compute new variables.
>   (--enable-as-accelerator-for, --enable-offload-targets): New options.
>   (ACCEL_COMPILER): Define if the compiler is built as the accel compiler.
>   (OFFLOAD_TARGETS): List of target names suitable for offloading.
>   (ENABLE_OFFLOADING): Define if list of offload targets is not empty.

Any patch adding new configure options needs to add documentation of the 
semantics of those options in install.texi.  I see no such documentation 
in this patch.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 3/5] IPA ICF pass

2014-09-26 Thread Jan Hubicka
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> new file mode 100644
> index 000..f3472fe
> --- /dev/null
> +++ b/gcc/ipa-icf.c
> @@ -0,0 +1,2841 @@
> +/* Interprocedural Identical Code Folding pass
> +   Copyright (C) 2014 Free Software Foundation, Inc.
> +
> +   Contributed by Jan Hubicka  and Martin Liska 
> 
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.  */
> +
> +/* Interprocedural Identical Code Folding for functions and
> +   read-only variables.
> +
> +   The goal of this transformation is to discover functions and read-only
> +   variables which do have exactly the same semantics.
(or value)
> +
> +   In case of functions,
> +   we could either create a virtual clone or do a simple function wrapper
> +   that will call equivalent function. If the function is just locally visible,
> +   all function calls can be redirected. For read-only variables, we create
> +   aliases if possible.
> +
> +   Optimization pass arranges as follows:

The optimization pass is arranged as follows: (I guess)

I also wonder if the gimple equality code should be in ipa_icf namespace; it is
intended to be shared with tail merging pass, so what about just calling it
gimple_sem_equality?

> +/* Verification function for edges E1 and E2.  */
> +
> +bool
> +func_checker::compare_edge (edge e1, edge e2)
> +{
> +  if (e1->flags != e2->flags)
> +return false;

In future we may want to experiment with checking that edge probabilities with
profile feedback match and refuse to merge BBs with different outgoing 
probabilities
(i.e. +-5%).
Just add it as TODO there, please.
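
A possible shape for such a check, purely as a sketch: the fixed-point scale, the function name and the way the tolerance is expressed are assumptions for illustration and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical fixed-point scale for probabilities: PROB_BASE == 100%.  */
#define PROB_BASE 10000

/* Return true if two edge probabilities, both scaled by PROB_BASE,
   differ by at most TOLERANCE_PERCENT percentage points.  */
static bool
edge_probabilities_close (int prob1, int prob2, int tolerance_percent)
{
  int limit = PROB_BASE / 100 * tolerance_percent;
  return abs (prob1 - prob2) <= limit;
}
```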
> +
> +/* Return true if types are compatible from perspective of ICF.  */
> +bool func_checker::types_are_compatible_p (tree t1, tree t2,

Perhaps dropping _are_ would make sense, so we do not have two names
for essentially same thing.
> +bool compare_polymorphic,
> +bool first_argument)
> +{
> +  if (TREE_CODE (t1) != TREE_CODE (t2))
> +return RETURN_FALSE_WITH_MSG ("different tree types");
> +
> +  if (!types_compatible_p (t1, t2))
> +return RETURN_FALSE_WITH_MSG ("types are not compatible");
> +
> +  if (get_alias_set (t1) != get_alias_set (t2))
> +return RETURN_FALSE_WITH_MSG ("alias sets are different");

You do not need to compare alias sets except for memory operations IMO.
> +
> +  /* We call contains_polymorphic_type_p with this pointer type.  */
> +  if (first_argument && TREE_CODE (t1) == POINTER_TYPE)
> +{
> +  t1 = TREE_TYPE (t1);
> +  t2 = TREE_TYPE (t2);
> +}
> +
> +  if (compare_polymorphic
> +  && (contains_polymorphic_type_p (t1) || contains_polymorphic_type_p (t2)))
> +{
> +  if (!contains_polymorphic_type_p (t1) || !contains_polymorphic_type_p (t2))
> + return RETURN_FALSE_WITH_MSG ("one type is not polymorphic");
> +
> +  if (TYPE_MAIN_VARIANT (t1) != TYPE_MAIN_VARIANT (t2))
> + return RETURN_FALSE_WITH_MSG ("type variants are different for "
> +   "polymorphic type");

I added types_must_be_same_for_odr (t1,t2) for you here.
> +/* Fast equality function based on knowledge known in WPA.  */
> +
> +bool
> +sem_function::equals_wpa (sem_item *item)
> +{
> +  gcc_assert (item->type == FUNC);
> +
> +  m_compared_func = static_cast<sem_function *> (item);
> +
> +  if (arg_types.length () != m_compared_func->arg_types.length ())
> +return RETURN_FALSE_WITH_MSG ("different number of arguments");
> +
> +  /* Checking types of arguments.  */
> +  for (unsigned i = 0; i < arg_types.length (); i++)
> +{
> +  /* This guard is here for function pointer with attributes (pr59927.c).  */
> +  if (!arg_types[i] || !m_compared_func->arg_types[i])
> + return RETURN_FALSE_WITH_MSG ("NULL argument type");
> +
> +  if (!func_checker::types_are_compatible_p (arg_types[i],
> +   m_compared_func->arg_types[i],
> +   true, i == 0))
> + return RETURN_FALSE_WITH_MSG ("argument type is different");
> +}
> +
> +  /* Result type checking.  */
> +  if (!func_checker::types_are_compatible_p (result_type,
> +  m_compared_func->result_type))
> +return RETURN_FALSE_WITH_MSG ("result types are different");

You may want to compare ECF flags, such as nothrow/const/pure.  We do not
want to merge non-pure function into pure as it may not be pure in the context
it is used.
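
The flag comparison suggested here could look roughly like the sketch below. The DEMO_* bit values are made up for the example and are not GCC's ECF_* definitions; only the idea of masking out the relevant bits and requiring them to be identical is being shown.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for const/pure/nothrow.  */
enum
{
  DEMO_CONST = 1 << 0,
  DEMO_PURE = 1 << 1,
  DEMO_NOTHROW = 1 << 2
};

/* Two functions are merge candidates only if they agree on every flag
   that callers may have relied on.  */
static bool
call_flags_compatible (unsigned flags1, unsigned flags2)
{
  const unsigned relevant = DEMO_CONST | DEMO_PURE | DEMO_NOTHROW;
  return (flags1 & relevant) == (flags2 & relevant);
}
```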

Do you compare attributes? I think optimize attribute ne

[Patch, MIPS] Cleanup mips header files.

2014-09-26 Thread Steve Ellcey

I would like to do some cleanup on the mips configuration code, both to
reduce the amount of duplicated code and to add better support for --with-arch,
--with-endian, and --with-abi.  As the first step in this work I would like
to check in this patch that removes the linux64.h and gnu-user64.h header
files and copies the needed pieces to linux.h and gnu-user.h.

Right now these headers are used when building a mips64* target or if you
use --enable-target=all, but there is no reason they can't be used for normal
32 bit mips targets too.  Then the only thing that has to be done differently for
mips64* targets (or --enable-target=all) vs. mips 32 bit targets is to add
the multilib makefile fragment (t-linux64).

Most of the changes here are just moving macros from one file to another.
The only real functional changes are with GNU_USER_TARGET_LINK_SPEC and
LINUX_DRIVER_SELF_SPECS where we pass more explicit options to the linker
and now have a single consistent definition of these macros for all mips
targets instead of different ones for mips32 and mips64.

I built multiple different mips*-*-linux-gnu targets to test this change
but there are a lot of combinations and I couldn't build all of them.

OK for checkin?

Steve Ellcey
sell...@mips.com



2014-09-26  Steve Ellcey  

* config/mips/linux64.h: Remove.
* config/mips/gnu-user64.h: Remove.
* config.gcc (mips*-*-*): Remove references to linux64.h and
gnu-user64.h
* config/mips/gnu-user.h (GNU_USER_TARGET_LINK_SPEC): Replace
with modified version from gnu-user64.h.
(LINUX_DRIVER_SELF_SPECS): Update parts from gnu-user64.h.
(LOCAL_LABEL_PREFIX): Copy from gnu-user64.h.
* config/mips/linux.h (GNU_USER_LINK_EMULATION32): Copy from
linux64.h.
(GNU_USER_LINK_EMULATION64): Ditto.
(GNU_USER_LINK_EMULATIONN32): Ditto.
(GLIBC_DYNAMIC_LINKER32): Ditto.
(GLIBC_DYNAMIC_LINKER64): Ditto.
(GLIBC_DYNAMIC_LINKERN32): Ditto.
(UCLIBC_DYNAMIC_LINKER32): Ditto.
(UCLIBC_DYNAMIC_LINKER64): Ditto.
(UCLIBC_DYNAMIC_LINKERN32): Ditto.
(BIONIC_DYNAMIC_LINKERN32): Ditto.
(GNU_USER_DYNAMIC_LINKERN32): Ditto.
(GLIBC_DYNAMIC_LINKER): Delete.
(UCLIBC_DYNAMIC_LINKER): Delete.


diff --git a/gcc/config.gcc b/gcc/config.gcc
index 0e50e9a..ce06656 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1941,7 +1941,7 @@ mips*-*-netbsd*)  # NetBSD/mips, either endian.
extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
;;
 mips*-mti-linux*)
-   tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h mips/mti-linux.h"
+   tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/linux.h mips/linux-common.h mips/mti-linux.h"
extra_options="${extra_options} linux-android.opt"
tmake_file="${tmake_file} mips/t-mti-linux"
tm_defines="${tm_defines} MIPS_ISA_DEFAULT=33 MIPS_ABI_DEFAULT=ABI_32"
@@ -1949,7 +1949,7 @@ mips*-mti-linux*)
gas=yes
;;
 mips64*-*-linux* | mipsisa64*-*-linux*)
-   tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h"
+   tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/linux.h mips/linux-common.h"
extra_options="${extra_options} linux-android.opt"
tmake_file="${tmake_file} mips/t-linux64"
tm_defines="${tm_defines} MIPS_ABI_DEFAULT=ABI_N32"
@@ -1973,7 +1973,6 @@ mips*-*-linux*)   # Linux MIPS, either endian.
tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/linux.h"
extra_options="${extra_options} linux-android.opt"
if test x$enable_targets = xall; then
-   tm_file="${tm_file} mips/gnu-user64.h mips/linux64.h"
tmake_file="${tmake_file} mips/t-linux64"
fi
tm_file="${tm_file} mips/linux-common.h"
diff --git a/gcc/config/mips/gnu-user.h b/gcc/config/mips/gnu-user.h
index 638d7f0..3bb2248 100644
--- a/gcc/config/mips/gnu-user.h
+++ b/gcc/config/mips/gnu-user.h
@@ -52,16 +52,20 @@ along with GCC; see the file COPYING3.  If not see
 #undef MIPS_DEFAULT_GVALUE
 #define MIPS_DEFAULT_GVALUE 0
 
-/* Borrowed from sparc/linux.h */
 #undef GNU_USER_TARGET_LINK_SPEC
-#define GNU_USER_TARGET_LINK_SPEC \
- "%(endian_spec) \
-  %{shared:-shared} \
+#define GNU_USER_TARGET_LINK_SPEC "\
+%{G*} %{EB} %{EL} %{!EB:%{!EL:%(endian_spec)}} %{mips*} \
+%{shared} \
   %{!shared: \
 %{!static: \
   %{rdynamic:-export-dynamic} \
-  -dynamic-linker " GNU_USER_DYNAMIC_LINKER "} \
-  %{static:-static}}"
+  %{mabi=n32: -dynamic-linker " GNU_USER_D

[fortran,patch] Emit code for some IEEE functions in the front-end

2014-09-26 Thread FX
Hi all,

The attached patch improves our code generation for some of the IEEE_ARITHMETIC 
functions: some testing functions (is*) and some arithmetic (logb, rint, scalb, 
…). The interfaces are still present in the module file generated as part of 
the library (which allows, in particular, for accurate testing of the extent of 
support we have), but we catch them before emitting the actual function call 
and emit front-end-generated code instead. This code uses some intrinsics that 
we didn’t use in the front-end so far (some type generic, some not), so I have 
added them (logb, remainder, rint, signbit).

The patch is nice because it improves the quality of the code generated, 
eliminating in many cases the need for a function call. It is also a 
prerequisite to extend our IEEE support to more floating-point types (extended 
precision and binary128, on some targets including i386/x86_64). Without it, we 
would have a combinatorial explosion of the number of “helper” functions in the 
library.
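
As a rough illustration of the idea (in C++ rather than Fortran, and not taken from the patch), each of the operations named above corresponds to a standard math function that GCC can treat as a builtin, so a thin wrapper has no need for a separate library helper. The wrapper names below are hypothetical and chosen only for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical wrappers, named for this sketch only; they are not the
// functions emitted by the Fortran front-end.

// Unbiased exponent of x, returned as a floating-point value.
inline double exponent_of(double x) { return std::logb(x); }

// Round to an integral value using the current rounding mode.
inline double round_current_mode(double x) { return std::rint(x); }

// IEEE remainder: x - n*y, where n is x/y rounded to nearest (ties to even).
inline double ieee_remainder(double x, double y) { return std::remainder(x, y); }

// x * 2^n by exponent adjustment (assumes a binary floating-point format).
inline double scale_by_power_of_two(double x, int n) { return std::scalbn(x, n); }

// True when the sign bit is set, including for negative zero.
inline bool sign_bit_set(double x) { return std::signbit(x); }

// Small usage check for the wrappers above.
inline void check_ieee_wrappers() {
  assert(exponent_of(8.0) == 3.0);               // 8 = 1.0 * 2^3
  assert(ieee_remainder(5.0, 3.0) == -1.0);      // 5 - 2*3
  assert(scale_by_power_of_two(1.5, 4) == 24.0); // 1.5 * 16
  assert(sign_bit_set(-0.0));
}
```

Because every wrapper is type-generic in spirit (the same operation exists for each floating-point kind), generating the code at the call site avoids writing one helper per operation per type.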

Also, I’m removing symbols from gfortran.map, but no branching/release has 
occurred since I added them in the first place: it should be all good.

Regtested on x86_64-apple-darwin14. This regresses ieee_2.f90, at -m32 
-fno-float-store only, where we seem to trigger a mis-simplification of 
__builtin_rint(). I’ll send a mail to the gcc list right after this one to get 
some help on that, and track the issue separately.

OK to commit?
FX




Attachments: ieee.ChangeLog, ieee.diff (binary data)


Re: [PATCH 3/5] IPA ICF pass

2014-09-26 Thread Jan Hubicka
> While a plain Firefox -flto build works fine. LTO/PGO build fails with:
> 
> lto1: internal compiler error: in ipa_merge_profiles, at ipa-utils.c:540
> 0x7d6165 ipa_merge_profiles(cgraph_node*, cgraph_node*)
> ../../gcc/gcc/ipa-utils.c:540
> 0xf10c41 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
> ../../gcc/gcc/ipa-icf.c:753
> 0xf15206 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
> ../../gcc/gcc/ipa-icf.c:2706
> 0xf1c1f4 ipa_icf::sem_item_optimizer::execute()
> ../../gcc/gcc/ipa-icf.c:2098
> 0xf1d3f1 ipa_icf_driver
> ../../gcc/gcc/ipa-icf.c:2784
> 0xf1d3f1 ipa_icf::pass_ipa_icf::execute(function*)
> ../../gcc/gcc/ipa-icf.c:2831
> 
> 
> The pass is also very memory hungry (from 3GB without ICF to 4GB during
> libxul link), while the code size savings are in the 1% range.

Thanks for checking. I was just thinking about doing that myself.  Would
you mind posting -ftime-report of the Firefox WPA stage?

It seems that in this case we reject too many equality candidates?
I think the original numbers were about 4-5%, but later some equivalences were
disabled because of devirt/aliasing issues. Did you compare it with gold ICF
enabled? There are quite a few obvious improvements to the analysis that can
be done, but I guess we need to analyze the interesting cases one by one.

One thing that Martin can try is to hook into lto-symtab and check that
the COMDAT functions that are known to be the same pass the equality check.
I suppose we will learn interesting things this way.

I think the patch adds quite important infrastructure for gimple semantic
equality checking and function merging. I went through the majority of the code
and I think it is mostly ready for mainline (i.e. cleaner than what we have in
tree-ssa-tailmerge), so I hope we can finish the review process next week.
We will need a better cost/benefit ratio to enable it at -O2, which is
something I would really like to see for 5.0, but it seems easier to
handle this incrementally.
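
For readers less familiar with the term, the kind of candidate a code-folding pass looks for can be sketched in C++ as below. The types and function names are made up for this example, and whether the pass under review would actually merge this pair depends on its equality rules (some equivalences were disabled for devirtualization/aliasing reasons, as noted above).

```cpp
#include <cassert>

// Two unrelated types with the same layout (two ints).
struct Point  { int x; int y; };
struct Extent { int width; int height; };

// Different names and parameter types, but the bodies do the same work:
// load two ints at the same offsets and add them.
int sum_point(const Point &p)   { return p.x + p.y; }
int sum_extent(const Extent &e) { return e.width + e.height; }

// Usage check: both functions compute the same result for the same bits.
inline void check_merge_candidates() {
  assert(sum_point(Point{1, 2}) == 3);
  assert(sum_extent(Extent{1, 2}) == 3);
}
```

If such a pair is judged equal, only one body needs to be kept and the other symbol can refer to it, which is where the code size saving comes from.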

Honza


Move tail merging pass forward

2014-09-26 Thread Jan Hubicka
Hi,
the testcase in PR35545 shows a case where the profile feedback infrastructure
does everything needed to get the testcase optimized (fully devirtualized), but
that does not happen because the tracer is run too late in the pass queue.

Tail duplication in general is a pass enabling more optimizations to be done
by forward propagating passes, such as constant propagation, fre or vrp.  It
makes no sense to run it afterwards.
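
A hand-written C++ illustration of why the ordering matters (this is not output from the tracer pass, and the function names are made up for the example): once the shared tail is copied into each predecessor, the multiplier is a known constant on each path, which a later forward-propagating pass can exploit.

```cpp
#include <cassert>

// Before duplication: both paths join before the multiply, so at that
// point `k` may be either 2 or 4.
inline int shared_tail(bool cond, int x) {
  int k;
  if (cond)
    k = 2;
  else
    k = 4;
  return x * k + 1;  // shared tail
}

// After duplicating the tail into each path by hand: `k` is a known
// constant in each copy, so constant propagation can fold it.
inline int duplicated_tail(bool cond, int x) {
  if (cond)
    return x * 2 + 1;  // copy of the tail with k == 2
  return x * 4 + 1;    // copy of the tail with k == 4
}

// Usage check: the transformation preserves behaviour.
inline void check_tail_duplication() {
  for (int x = -3; x <= 3; ++x) {
    assert(shared_tail(true, x) == duplicated_tail(true, x));
    assert(shared_tail(false, x) == duplicated_tail(false, x));
  }
}
```

The cost is larger code (the tail now exists twice), which is why running the duplication after the propagating passes gives the growth without the benefit.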

This patch moves it to shortly after loop optimization (since, like loop
optimization, it increases the number of basic blocks). It may make sense to
move it even further, but that would probably need more benchmarking.

I have profiledbootstrapped the patch with LTO on x86_64-linux and also
tested with Firefox. Committed.

Honza

PR middle-end/35545
* passes.def (pass_tracer): Move before last dominator pass.
* g++.dg/tree-prof/pr35545.C: New testcase.
Index: passes.def
===
--- passes.def  (revision 215645)
+++ passes.def  (working copy)
@@ -252,6 +252,7 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_cse_reciprocals);
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_strength_reduction);
+  NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_dominator);
   NEXT_PASS (pass_strlen);
   NEXT_PASS (pass_vrp);
@@ -262,7 +263,6 @@ along with GCC; see the file COPYING3.
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
   NEXT_PASS (pass_cd_dce);
-  NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
Index: testsuite/g++.dg/tree-prof/pr35545.C
===
--- testsuite/g++.dg/tree-prof/pr35545.C(revision 0)
+++ testsuite/g++.dg/tree-prof/pr35545.C(revision 0)
@@ -0,0 +1,52 @@
+// devirt.cc
+/* { dg-options "-O2 -fdump-ipa-profile_estimate -fdump-tree-optimized" } */
+
+class A {
+public:
+  virtual int foo() {
+ return 1;
+  }
+
+int i;
+};
+
+class B : public A
+{
+public:
+  virtual int foo() {
+ return 2;
+  }
+
+ int b;
+} ;
+
+
+int main()
+{
+ int i;
+
+  A* ap = 0;
+
+  for (i = 0; i < 1; i++)
+  {
+
+ if (i%7==0)
+ {
+ap = new A();
+ }
+ else
+ap = new B();
+
+ap->foo();
+
+delete ap;
+
+  }
+
+  return 0;
+
+}
+/* { dg-final-use { scan-ipa-dump "Indirect call -> direct call" "profile_estimate" } } */
+/* { dg-final-use { cleanup-ipa-dump "profile" } } */
+/* { dg-final-use { scan-ipa-dump-not "OBJ_TYPE_REF" "optimized" } } */
+/* { dg-final-use { cleanup-tree-dump "optimized" } } */


Fix pasto in ipa_polymorphic_call_context::restrict_to_inner_class

2014-09-26 Thread Jan Hubicka
Hi,
this patch fixes an ICE seen with the testcase for PR62121, which is caused by
a pasto in size checking.
The GCC 4.9 issue is different and has been fixed in the meantime.  I will
backport that change.
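
The condition being corrected amounts to a containment test. In the sketch below the helper name is hypothetical and plain integers stand in for GCC's tree sizes (TYPE_SIZE is expressed in bits); the point is only that the size on the right-hand side must be that of the intended type (`subtype` rather than `type` in the diff below).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true when an object of `expected_size` bits
// placed at `cur_offset` bits would run past the end of a container of
// `container_size` bits. Overflow of the sum is ignored in this sketch.
inline bool overruns_container(std::uint64_t cur_offset,
                               std::uint64_t expected_size,
                               std::uint64_t container_size) {
  return cur_offset + expected_size > container_size;
}

// Usage check: same offset and object size, different container sizes.
inline void check_overruns_container() {
  assert(!overruns_container(0, 64, 64));    // fits exactly
  assert(overruns_container(32, 64, 64));    // runs past the end
  assert(!overruns_container(32, 64, 128));  // a larger container hides it
}
```

The last two assertions show how comparing against the wrong container size can turn a real overrun into an apparent fit.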

Bootstrapped/regtested x86_64-linux, committed.

PR ipa/62121
* ipa-polymorphic-call.c
(ipa_polymorphic_call_context::restrict_to_inner_class): Fix pasto
in checking array size.

* g++.dg/torture/pr62121.C: New testcase.
Index: ipa-polymorphic-call.c
===
--- ipa-polymorphic-call.c  (revision 215645)
+++ ipa-polymorphic-call.c  (working copy)
@@ -327,7 +327,7 @@ ipa_polymorphic_call_context::restrict_t
  && (cur_offset
  + (expected_type ? tree_to_uhwi (TYPE_SIZE (expected_type))
 : 0)
- > tree_to_uhwi (TYPE_SIZE (type
+ > tree_to_uhwi (TYPE_SIZE (subtype
goto no_useful_type_info;
 
  cur_offset = new_offset;
Index: testsuite/g++.dg/torture/pr62121.C
===
--- testsuite/g++.dg/torture/pr62121.C  (revision 0)
+++ testsuite/g++.dg/torture/pr62121.C  (revision 0)
@@ -0,0 +1,12 @@
+// { dg-do compile }
+class A
+{
+  virtual double operator()();
+};
+class B : A
+{
+public:
+  double operator()();
+};
+extern B a[];
+int b = a[0]();


Re: [PATCH 3/5] IPA ICF pass

2014-09-26 Thread Markus Trippelsdorf
On 2014.09.27 at 01:27 +0200, Jan Hubicka wrote:
> > While a plain Firefox -flto build works fine. LTO/PGO build fails with:
> > 
> > lto1: internal compiler error: in ipa_merge_profiles, at ipa-utils.c:540
> > 0x7d6165 ipa_merge_profiles(cgraph_node*, cgraph_node*)
> > ../../gcc/gcc/ipa-utils.c:540
> > 0xf10c41 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
> > ../../gcc/gcc/ipa-icf.c:753
> > 0xf15206 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
> > ../../gcc/gcc/ipa-icf.c:2706
> > 0xf1c1f4 ipa_icf::sem_item_optimizer::execute()
> > ../../gcc/gcc/ipa-icf.c:2098
> > 0xf1d3f1 ipa_icf_driver
> > ../../gcc/gcc/ipa-icf.c:2784
> > 0xf1d3f1 ipa_icf::pass_ipa_icf::execute(function*)
> > ../../gcc/gcc/ipa-icf.c:2831
> > 
> > 
> > The pass is also very memory hungry (from 3GB without ICF to 4GB during
> > libxul link), while the code size savings are in the 1% range.
> 
> Thanks for checking. I was just thinking about doing that myself.  Would
> you mind posting -ftime-report of firefox WPA stage?

(without ICF)
Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1412 kB ( 0%) ggc
 phase opt and generate  :  58.38 (63%) usr   2.00 (47%) sys  60.37 (40%) wall  403069 kB (12%) ggc
 phase stream in         :  30.24 (33%) usr   0.97 (23%) sys  33.90 (22%) wall 2944210 kB (88%) ggc
 phase stream out        :   4.29 ( 5%) usr   1.32 (31%) sys  57.32 (38%) wall       0 kB ( 0%) ggc
 phase finalize          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 garbage collection      :   3.68 ( 4%) usr   0.00 ( 0%) sys   3.68 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   0.50 ( 1%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall     166 kB ( 0%) ggc
 ipa dead code removal   :   6.91 ( 7%) usr   0.08 ( 2%) sys   7.25 ( 5%) wall       0 kB ( 0%) ggc
 ipa virtual call target :   7.08 ( 8%) usr   0.04 ( 1%) sys   6.93 ( 5%) wall       0 kB ( 0%) ggc
 ipa devirtualization    :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall   10365 kB ( 0%) ggc
 ipa cp                  :   1.81 ( 2%) usr   0.06 ( 1%) sys   3.40 ( 2%) wall  173701 kB ( 5%) ggc
 ipa inlining heuristics :  16.60 (18%) usr   0.27 ( 6%) sys  17.48 (12%) wall  532704 kB (16%) ggc
 ipa comdats             :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out      :   0.21 ( 0%) usr   0.04 ( 1%) sys   0.97 ( 1%) wall       0 kB ( 0%) ggc
 ipa lto decl in         :  18.29 (20%) usr   0.54 (13%) sys  18.96 (12%) wall 2226088 kB (66%) ggc
 ipa lto decl out        :   3.93 ( 4%) usr   0.13 ( 3%) sys   4.06 ( 3%) wall       0 kB ( 0%) ggc
 ipa lto constructors in :   0.24 ( 0%) usr   0.03 ( 1%) sys   0.59 ( 0%) wall   14226 kB ( 0%) ggc
 ipa lto constructors out:   0.08 ( 0%) usr   0.04 ( 1%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   0.89 ( 1%) usr   0.12 ( 3%) sys   1.02 ( 1%) wall  364151 kB (11%) ggc
 ipa lto decl merge      :   2.14 ( 2%) usr   0.01 ( 0%) sys   2.14 ( 1%) wall    8196 kB ( 0%) ggc
 ipa lto cgraph merge    :   1.59 ( 2%) usr   0.00 ( 0%) sys   1.60 ( 1%) wall   12716 kB ( 0%) ggc
 whopr wpa               :   1.54 ( 2%) usr   0.03 ( 1%) sys   1.55 ( 1%) wall       1 kB ( 0%) ggc
 whopr wpa I/O           :   0.04 ( 0%) usr   1.11 (26%) sys  52.10 (34%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   5.02 ( 5%) usr   0.01 ( 0%) sys   5.03 ( 3%) wall    4938 kB ( 0%) ggc
 ipa reference           :   2.04 ( 2%) usr   0.02 ( 0%) sys   2.08 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile             :   0.32 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   2.43 ( 3%) usr   0.02 ( 0%) sys   2.49 ( 2%) wall       0 kB ( 0%) ggc
 tree STMT verifier      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier      :  16.31 (18%) usr   1.69 (39%) sys  17.96 (12%) wall       0 kB ( 0%) ggc
 dominance computation   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :   0.01 ( 0%) usr   0.03 ( 1%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :   0.69 ( 1%) usr   0.00 ( 0%) sys   0.69 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                   :  92.91              4.29            151.73            3348693 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

(with ICF)
Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1412 kB ( 0%) ggc
 phase opt and generate  :  82.70 (70%) usr   3.31 (53%) sys  86.17 (45%) wall 1468975 kB (33%) ggc
 phase stream in         :  30.46 (26%) usr   1.02 (16%) sys  31.48 (16%) wall 2944210 kB (67%) ggc
 phase stream out        :   4.52 ( 4%) usr   1.90 (30%) sys  73.47 (38%) wall      12 kB ( 0%) ggc
 phase final