Re: Question about generated type for common block in fortran

2017-11-09 Thread Bin.Cheng
On Wed, Nov 8, 2017 at 3:22 PM, Richard Biener wrote: > On Wed, Nov 8, 2017 at 3:45 PM, Michael Matz wrote: >> Hi, >> >> On Wed, 8 Nov 2017, Richard Biener wrote: >> >>> Not sure how - the issue is the FIELD_DECLs overlap which rules out a >>> RECORD_TYPE and leaves us with a UNION_TYPE. >> >> No

Re: eliminate dead stores across functions

2018-03-06 Thread Bin.Cheng
On Tue, Mar 6, 2018 at 2:28 PM, Richard Biener wrote: > On Tue, Mar 6, 2018 at 1:00 PM, Prathamesh Kulkarni > wrote: >> Hi, >> For the following test-case, >> >> int a; >> >> __attribute__((noinline)) >> static void foo() >> { >> a = 3; >> } >> >> int main() >> { >> a = 4; >> foo (); >> r

Re: eliminate dead stores across functions

2018-03-06 Thread Bin.Cheng
On Tue, Mar 6, 2018 at 4:44 PM, Martin Jambor wrote: > Hi Bin, > > On Tue, Mar 06 2018, Bin Cheng wrote: >> On Tue, Mar 6, 2018 at 2:28 PM, Richard Biener >>> >>> Do you think the situation happens often enough to make this worthwhile? >> There is one probably more useful case. Program may use gl

Re: eliminate dead stores across functions

2018-03-06 Thread Bin.Cheng
On Tue, Mar 6, 2018 at 4:50 PM, Bin.Cheng wrote: > On Tue, Mar 6, 2018 at 4:44 PM, Martin Jambor wrote: >> Hi Bin, >> >> On Tue, Mar 06 2018, Bin Cheng wrote: >>> On Tue, Mar 6, 2018 at 2:28 PM, Richard Biener >>>> >>>> Do you think the

Re: How big (and fast) is going to be GCC 8? [part 2]

2018-03-06 Thread Bin.Cheng
On Tue, Mar 6, 2018 at 5:50 PM, Martin Liška wrote: > Hi. > > This is speed comparison of GCC 8 builds compared to my system GCC 7.3.0 > which is built with PGO bootstrap. > > I run empty C and C++ source file, tramp3d and the rest are some big beasts > from GCC source file. Feel free to suggest a

Re: Fortran array slices and -frepack-arrays

2018-04-13 Thread Bin.Cheng
On Fri, Apr 13, 2018 at 3:32 PM, Wilco Dijkstra wrote: > Hi, > > I looked at a few performance anomalies between gfortran and Flang - it > appears array slices > are treated differently. Using -frepack-arrays fixed a performance issue in > gfortran and didn't > cause any regressions. Making inpu

Re: Loop fusion.

2018-04-23 Thread Bin.Cheng
On Sun, Apr 22, 2018 at 3:27 PM, Toon Moene wrote: > A few days ago there was a rant on the Fortran Standardization Committee's > e-mail list about Fortran's "whole array arithmetic" being unoptimizable. > > An example picked at random from our weather forecasting code: > > ZQICE(1:NPROMA,1:NF

Re: Bin Cheng appointed Loop Optimizer co-maintainer

2018-05-22 Thread Bin.Cheng
On Mon, May 21, 2018 at 6:20 PM, David Edelsohn wrote: > I am pleased to announce that the GCC Steering Committee has > appointed Bin Cheng as Loop Optimizer co-maintainer. > > Please join me in congratulating Bin on his new role. > Bin, please update your listing in the MAINTAINER

Re: PR80155: Code hoisting and register pressure

2018-05-23 Thread Bin.Cheng
On Wed, May 23, 2018 at 9:28 AM, Richard Biener wrote: > On Wed, 23 May 2018, Prathamesh Kulkarni wrote: > >> Hi, >> I am trying to work on PR80155, which exposes a problem with code >> hoisting and register pressure on a leading embedded benchmark for ARM >> cortex-m7, where code-hoisting causes

Re: PR80155: Code hoisting and register pressure

2018-05-25 Thread Bin.Cheng
On Fri, May 25, 2018 at 10:23 AM, Prathamesh Kulkarni wrote: > On 23 May 2018 at 18:37, Jeff Law wrote: >> On 05/23/2018 03:20 AM, Prathamesh Kulkarni wrote: >>> On 23 May 2018 at 13:58, Richard Biener wrote: On Wed, 23 May 2018, Prathamesh Kulkarni wrote: > Hi, > I am trying t

Re: PR80155: Code hoisting and register pressure

2018-05-26 Thread Bin.Cheng
On Fri, May 25, 2018 at 5:54 PM, Richard Biener wrote: > On May 25, 2018 6:57:13 PM GMT+02:00, Jeff Law wrote: >>On 05/25/2018 03:49 AM, Bin.Cheng wrote: >>> On Fri, May 25, 2018 at 10:23 AM, Prathamesh Kulkarni >>> wrote: >>>> On 23 May 2018 at 18:37, Jef

Re: How to get GCC on par with ICC?

2018-06-06 Thread Bin.Cheng
On Wed, Jun 6, 2018 at 3:51 PM, Paul Menzel wrote: > Dear GCC folks, > > > Some scientists in our organization still want to use the Intel compiler, as > they say, it produces faster code, which is then executed on clusters. Some > resources on the Web [1][2] confirm this. (I am aware, that it’s h

prepare_decl_rtl in tree-ssa-loop-ivopts.c doesn't work with (ADDR_EXPR (ARRAY_REF))?

2013-04-28 Thread Bin.Cheng
Hi, Currently tree-ssa-loop-ivopts.c doesn't calculate addr_expr's cost, but uses target_spill_cost instead, as in function force_expr_to_var_cost. When I experimented with ivopts by calling computation_cost to calculate the cost of ADDR_EXPR, I encountered an assert failure in expand_expr_addr_expr_1,

Re: prepare_decl_rtl in tree-ssa-loop-ivopts.c doesn't work with (ADDR_EXPR (ARRAY_REF))?

2013-04-28 Thread Bin.Cheng
(DECL_MODE (obj), (*regno)++); It generates RTX_REG(regno) for an var_decl which is an array has DECL_MODE == OImode. Any suggestions? On Sun, Apr 28, 2013 at 3:15 PM, Bin.Cheng wrote: > Hi, > Currently tree-ssa-loop-ivopts.c doesn't calculate addr_expr's cost, > while use

Re: prepare_decl_rtl in tree-ssa-loop-ivopts.c doesn't work with (ADDR_EXPR (ARRAY_REF))?

2013-05-16 Thread Bin.Cheng
On Sun, Apr 28, 2013 at 8:15 PM, Richard Biener wrote: > "Bin.Cheng" wrote: > >>I suspect codes in prepare_decl_rtl: >> >>case VAR_DECL: >>case PARM_DECL: >>case RESULT_DECL: >> *ws = 0; >> obj = *expr_p; >&

Re: prepare_decl_rtl in tree-ssa-loop-ivopts.c doesn't work with (ADDR_EXPR (ARRAY_REF))?

2013-05-16 Thread Bin.Cheng
On Thu, May 16, 2013 at 7:21 PM, Bin.Cheng wrote: > On Sun, Apr 28, 2013 at 8:15 PM, Richard Biener > wrote: >> "Bin.Cheng" wrote: >> >>>I suspect codes in prepare_decl_rtl: >>> >>>case VAR_DECL: >>>case PARM_DECL: &

Question on operand_equal_p on different type conversion expressions

2013-05-20 Thread Bin.Cheng
Hi, I ran into a call of operand_equal_p for two type conversion tree nodes like: arg0: unit size align 16 symtab 0 alias set -1 canonical type 0xb74faae0 precision 16 min max > arg 0 unit size align 32 symtab 0 alias set 5 canonical type 0xb74fa42

Re: Question on operand_equal_p on different type conversion expressions

2013-05-20 Thread Bin.Cheng
On Tue, May 21, 2013 at 1:55 PM, Andrew Pinski wrote: > On Mon, May 20, 2013 at 10:50 PM, Bin.Cheng wrote: > > > NOP_EXPR here is a misnamed tree really. It could also be a > CONVERT_EXPR and still have the same issue as the types are not the > same. > > >>

Re: Question on operand_equal_p on different type conversion expressions

2013-05-21 Thread Bin.Cheng
On Tue, May 21, 2013 at 4:50 PM, Richard Biener wrote: > On Tue, May 21, 2013 at 8:38 AM, Bin.Cheng wrote: >> On Tue, May 21, 2013 at 1:55 PM, Andrew Pinski wrote: >>> On Mon, May 20, 2013 at 10:50 PM, Bin.Cheng wrote: >> >>> >>> >>> NOP_EXPR

[RFC]How to get more accurate cost of pre-loop calculations in ivopts pass

2013-05-31 Thread Bin.Cheng
Hi, While studying the ivopt pass, I found that the cost of pre-loop calculations is computed inaccurately in many scenarios. There are two kinds of pre-loop calculations: the base of candidates and the invariant part of the iv use representation. For the base of iv candidates, it is calculated as below: cost_base = fo

fragile test case ivopt_infer_2.c

2013-06-19 Thread Bin.Cheng
Hi, For test case gcc.dg/tree-ssa/ivopt_infer_2.c #ifndef TYPE #define TYPE char* #endif extern char a[]; /* Can not infer loop iteration from array -- exit test can not be replaced. */ void foo (unsigned int i_width, TYPE dst) { unsigned long long i = 0; unsigned long long j = 0; for ( ;

Re: fragile test case ivopt_infer_2.c

2013-06-19 Thread Bin.Cheng
On Wed, Jun 19, 2013 at 4:43 PM, Richard Biener wrote: > On Wed, Jun 19, 2013 at 10:04 AM, Bin.Cheng wrote: >> Hi, >> For test case gcc.dg/tree-ssa/ivopt_inter_2.c >> >> #ifndef TYPE >> #define TYPE char* >> #endif >> >> extern char a[]; >&

Re: fragile test case ivopt_infer_2.c

2013-06-19 Thread Bin.Cheng
On Wed, Jun 19, 2013 at 8:33 PM, Richard Biener wrote: > On Wed, Jun 19, 2013 at 12:35 PM, Bin.Cheng wrote: >> On Wed, Jun 19, 2013 at 4:43 PM, Richard Biener >> wrote: >>> On Wed, Jun 19, 2013 at 10:04 AM, Bin.Cheng wrote: >>>> Hi, >>>&g

Question on register renaming in rtl loop unroll pass

2013-06-28 Thread Bin.Cheng
Hi, I have a question about register renaming in rtl loop unroll. For an example loop: .L1: [r162] <- x r162 <- r162 + 4 ... b .L1 After unrolling: .L1: [r162] <- x r197 <- r162 + 4 r162 <- r197 ... [r162] <- y r162 <- r197 + 4 ... b .L1 Why not: .L1: [r162] <- x r16

Re: Question on register renaming in rtl loop unroll pass

2013-06-28 Thread Bin.Cheng
On Fri, Jun 28, 2013 at 6:10 PM, Eric Botcazou wrote: >> Hi, I have a question about register renaming in rtl loop unroll. >> For an example loop: >> .L1: >> [r162] <- x >> r162 <- r162 + 4 >> ... >> b .L1 >> >> After unrolling: >> .L1: >> [r162] <- x >> r197 <- r162 + 4 >> r162 <-

Re: Question on register renaming in rtl loop unroll pass

2013-07-04 Thread Bin.Cheng
On Fri, Jun 28, 2013 at 6:39 PM, Eric Botcazou wrote: >> The problem is auto-inc-dec is weak and can only capture >> post-increment in first part of code, generating even worse code for >> RA: >> .L1: >> r197 <- r162 >> [r197++] <- x >> ... >> [r162+4] <- y >> r162 <- r197+0x4 >> ... >

Question on the fwprop pass

2013-07-12 Thread Bin.Cheng
Hi, I encountered below example, 79: r169:SI=r190:SI<<0x2 115: r180:SI=r180:SI+r169:SI 116: cc:CC=cmp(r181:SI,r190:SI) 117: pc={(cc:CC==0)?L125:pc} The register r169 is only defined by insn79, so I was hoping the reference in insn115 can be replaced by "r190<<0x2", thus saving one ins

libstdc++ test case ext/headers.cc failed on arm-none-eabi

2013-08-07 Thread Bin.Cheng
Hi, I spotted case ext/headers.cc failed on arm-none-eabi with below information: In file included from /home/build/work/gcc-build/arm-none-eabi/armv7-m/libstdc++-v3/include/arm-none-eabi/bits/gthr.h:148:0, from /home/build/work/gcc-build/arm-none-eabi/armv7-m/libstdc++-v3/include

Undefined behavior or gcc is doing additional good job?

2014-01-03 Thread Bin.Cheng
Hi, For below simple example: #include <stdint.h> extern uint32_t __bss_start[]; extern uint32_t __data_start[]; void Reset_Handler(void) { /* Clear .bss section (initialize with zeros) */ for (uint32_t* bss_ptr = __bss_start; bss_ptr != __data_start; ++bss_ptr) { *bss_ptr = 0; } } One snapshot of ou

Re: Undefined behavior or gcc is doing additional good job?

2014-01-03 Thread Bin.Cheng
On Fri, Jan 3, 2014 at 4:24 PM, Jakub Jelinek wrote: > On Fri, Jan 03, 2014 at 04:12:19PM +0800, Bin.Cheng wrote: >> Hi, For below simple example: >> #include >> >> extern uint32_t __bss_start[]; >> extern uint32_t __data_start[]; >> >> void Reset

Re: Undefined behavior or gcc is doing additional good job?

2014-01-07 Thread Bin.Cheng
On Fri, Jan 3, 2014 at 5:17 PM, Jakub Jelinek wrote: > On Fri, Jan 03, 2014 at 04:44:48PM +0800, Bin.Cheng wrote: >> >> extern uint32_t __bss_start[]; >> >> extern uint32_t __data_start[]; >> >> >> >> void Reset_Handler(void) >>

Re: Undefined behavior or gcc is doing additional good job?

2014-01-07 Thread Bin.Cheng
On Tue, Jan 7, 2014 at 4:10 PM, Jakub Jelinek wrote: > On Tue, Jan 07, 2014 at 04:01:23PM +0800, Bin.Cheng wrote: >> Em, YES, it comes from ivopt rewriting, but, if it's not undefined >> behavior, won't it be annoying (or simply wrong) for compiler to do >> so

Possible enhancement for RTL data flow?

2014-01-07 Thread Bin.Cheng
Hi, I noticed function df_insn_rescan always deletes and re-computes insn_info if any one of defs/uses/eq_uses/mw is verified changed by df_insn_refs_verify, even in some passes (like fwprop), the defs are never changed. Could it be improved to only update the changed part (especially we have df_r

Re: RFC: Handle conditional expression in sccvn/fre/pre

2012-02-29 Thread Bin.Cheng
On Mon, Jan 2, 2012 at 10:54 PM, Richard Guenther wrote: > On Mon, Jan 2, 2012 at 3:09 PM, Amker.Cheng wrote: >> On Mon, Jan 2, 2012 at 9:37 PM, Richard Guenther >> wrote: >> >>> Well, with >>> >>> Index: gcc/tree-ssa-pre.c >>> ===

Re: RFC: Handle conditional expression in sccvn/fre/pre

2012-03-01 Thread Bin.Cheng
>> Second point, as you said, PRE often get confused and moves compare >> EXPR far from jump statement. Could we rely on register re-materialize >> to handle this, or any other solution? > > Well, a simple kind of solution would be to preprocess the IL before > redundancy elimination and separate t

why no shortcut operation for comparion on _Complex operands

2012-03-25 Thread Bin.Cheng
Hi, In tree-complex.c's function expand_complex_comparison, gcc just expand comparison on complex operands into comparisons on inner type, like: D.5375_17 = REALPART_EXPR ; D.5376_18 = IMAGPART_EXPR ; g2.1_5 = COMPLEX_EXPR ; D.5377_19 = REALPART_EXPR ; D.5378_20 = IMAGPART_EXPR ; g3.2_

Re: why no shortcut operation for comparion on _Complex operands

2012-03-26 Thread Bin.Cheng
On Mon, Mar 26, 2012 at 3:27 PM, Richard Guenther wrote: > On Sun, Mar 25, 2012 at 2:42 PM, Bin.Cheng wrote: >> Hi, >> In tree-complex.c's function expand_complex_comparison, gcc just >> expand comparison on complex >> operands into comparisons on in

Missed optimization in PRE?

2012-03-29 Thread Bin.Cheng
Hi, Following is the tree dump of 094t.pre for a test program. Question is loads of D.5375_12/D.5375_14 are redundant on path , but why not lowered into basic block 3, where it is used. BTW, seems no tree pass handles this case currently. Any idea? Thanks int z$imag; int z$real; int D.5378

Re: Missed optimization in PRE?

2012-03-29 Thread Bin.Cheng
On Thu, Mar 29, 2012 at 6:07 PM, Richard Guenther wrote: > On Thu, Mar 29, 2012 at 12:02 PM, Bin.Cheng wrote: >> Hi, >> Following is the tree dump of 094t.pre for a test program. >> Question is loads of D.5375_12/D.5375_14 are redundant on path > bb7, bb5, bb6>, >&g

Re: Missed optimization in PRE?

2012-03-29 Thread Bin.Cheng
On Thu, Mar 29, 2012 at 6:14 PM, Richard Guenther wrote: > On Thu, Mar 29, 2012 at 12:10 PM, Bin.Cheng wrote: >> On Thu, Mar 29, 2012 at 6:07 PM, Richard Guenther >> wrote: >>> On Thu, Mar 29, 2012 at 12:02 PM, Bin.Cheng wrote: >>>> Hi, >>>> F

Re: Missed optimization in PRE?

2012-03-29 Thread Bin.Cheng
On Thu, Mar 29, 2012 at 6:14 PM, Richard Guenther wrote: > On Thu, Mar 29, 2012 at 12:10 PM, Bin.Cheng wrote: >> On Thu, Mar 29, 2012 at 6:07 PM, Richard Guenther >> wrote: >>> On Thu, Mar 29, 2012 at 12:02 PM, Bin.Cheng wrote: >>>> Hi, >>>> F

Re: Missed optimization in PRE?

2012-03-30 Thread Bin.Cheng
On Fri, Mar 30, 2012 at 4:15 PM, Richard Guenther wrote: > On Thu, Mar 29, 2012 at 5:25 PM, Bin.Cheng wrote: >> On Thu, Mar 29, 2012 at 6:14 PM, Richard Guenther >> wrote: >>> On Thu, Mar 29, 2012 at 12:10 PM, Bin.Cheng wrote: >>>> On Thu, Mar 29, 2012 at 6

Bug in reload when forming inheritable reload information?

2012-04-03 Thread Bin.Cheng
Hi, Recently I found a test program got wrongly reloaded, as reported in PR52804. As the comment, I think reload_reg_reaches_end_p in reload1.c should handle case: The first reload type is RELOAD_FOR_INPADDR_ADDRESS; the second reload type is RELOAD_FOR_INPADDR_ADDRESS and the reload register is sa

Re: Missed optimization in PRE?

2012-04-08 Thread Bin.Cheng
On Fri, Mar 30, 2012 at 5:43 PM, Bin.Cheng wrote: > On Fri, Mar 30, 2012 at 4:15 PM, Richard Guenther > wrote: >> On Thu, Mar 29, 2012 at 5:25 PM, Bin.Cheng wrote: >>> On Thu, Mar 29, 2012 at 6:14 PM, Richard Guenther >>> wrote: >>>> On Thu, Mar 29, 20

Re: Missed optimization in PRE?

2012-04-10 Thread Bin.Cheng
On Mon, Apr 9, 2012 at 7:02 PM, Richard Guenther wrote: > On Mon, Apr 9, 2012 at 8:00 AM, Bin.Cheng wrote: >> On Fri, Mar 30, 2012 at 5:43 PM, Bin.Cheng wrote: >> >> Hi Richard, >> I am testing a patch to sink load of memory to proper basic block. >>

Re: Missed optimization in PRE?

2012-04-11 Thread Bin.Cheng
On Wed, Apr 11, 2012 at 11:28 AM, Bin.Cheng wrote: > On Mon, Apr 9, 2012 at 7:02 PM, Richard Guenther > wrote: >> On Mon, Apr 9, 2012 at 8:00 AM, Bin.Cheng wrote: >>> On Fri, Mar 30, 2012 at 5:43 PM, Bin.Cheng wrote: > >>> >>> Hi Richard, >>

Re: Missed optimization in PRE?

2012-04-11 Thread Bin.Cheng
On Wed, Apr 11, 2012 at 5:09 PM, Richard Guenther wrote: > On Wed, Apr 11, 2012 at 10:05 AM, Bin.Cheng wrote: >> On Wed, Apr 11, 2012 at 11:28 AM, Bin.Cheng wrote: >> >> Turns out if-conversion checks whether gimple statement traps or not. >> For the statement "

About sink load from memory in tree-ssa-sink.c

2012-04-17 Thread Bin.Cheng
Hi, As discussed at thread "http://gcc.gnu.org/ml/gcc/2012-04/msg00396.html", I am trying a patch now. The problem here is I have to go through all basic block from "sink_from" to "sink_to" to check whether the memory might be clobbered in them. Currently I have two methods: 1, do fully data analy

Re: About sink load from memory in tree-ssa-sink.c

2012-04-20 Thread Bin.Cheng
On Wed, Apr 18, 2012 at 5:25 PM, Richard Guenther wrote: > On Wed, Apr 18, 2012 at 8:53 AM, Bin.Cheng wrote: > > I don't understand method 2.  I'd do > >  start at the single predecessor of the sink-to block > >  foreach stmt from the end to the beginning of t

Re: About sink load from memory in tree-ssa-sink.c

2012-04-20 Thread Bin.Cheng
On Fri, Apr 20, 2012 at 4:54 PM, Richard Guenther wrote: > On Fri, Apr 20, 2012 at 9:52 AM, Bin.Cheng wrote: >> On Wed, Apr 18, 2012 at 5:25 PM, Richard Guenther >> wrote: >>> On Wed, Apr 18, 2012 at 8:53 AM, Bin.Cheng wrote: >> >>> >>> I don&

Question on scan_one_insn in IRA about load parameters from stack slot.

2012-04-24 Thread Bin.Cheng
Hi, In scan_one_insn, gcc checks whether an insn loads a parameter from its stack slot, and then records the fact by decreasing the memory cost. What I do not understand is why the check condition below tests the REG_EQUIV note, rather than checking for a memory access through the stack pointer directly. if

Re: Question on scan_one_insn in IRA about load parameters from stack slot.

2012-04-26 Thread Bin.Cheng
On Wed, Apr 25, 2012 at 10:46 PM, Vladimir Makarov wrote: > On 04/24/2012 11:56 PM, Bin.Cheng wrote: >> >> Hi, >> In scan_one_insn, gcc checks whether an insn loads a parameter from >> its stack slot, and then >> record the fact by decrease the memory cost. >

conflicts between combine and pre global passes?

2012-04-27 Thread Bin.Cheng
Hi, I noticed that global passes before combine, like loop-invariant/cprop/cse2, sometimes conflict with combine. The combine pass can only operate within a basic block, while these global passes move insns across basic blocks and leave no description info. For example, a case I encountered. (ins

Re: conflicts between combine and pre global passes?

2012-04-28 Thread Bin.Cheng
On Sat, Apr 28, 2012 at 4:55 PM, Eric Botcazou wrote: >> I noticed that global passes before combine, like loop-invariant/cprop/cse2 >> some time have conflicts with combine. >> The combine pass can only operates with basic block, while these global >> passes move insns across basic block and left

Re: conflicts between combine and pre global passes?

2012-04-28 Thread Bin.Cheng
On Sat, Apr 28, 2012 at 5:15 PM, Eric Botcazou wrote: >> I am sorry for misleading description. By "propagate register 172 in >> insn79 and delete insn78" >> I was meaning that gcc replaces reg 172 in insn79 with another >> register contains ZERO and >> that register(saying reg X) is defined in ot

Re: conflicts between combine and pre global passes?

2012-04-28 Thread Bin.Cheng
On Sat, Apr 28, 2012 at 6:13 PM, Eric Botcazou wrote: >> Yes, the reason here should be the pattern for insn 79 has predicates on >> its operands and does not allow constant here. > > And there is no way to get rid of the 2 pluses and thus change the pattern? > > -- > Eric Botcazou The instructio

Re: Question on scan_one_insn in IRA about load parameters from stack slot.

2012-05-07 Thread Bin.Cheng
On Mon, May 7, 2012 at 10:20 PM, Vladimir Makarov wrote: > On 04/26/2012 04:49 AM, Bin.Cheng wrote: >> >> On Wed, Apr 25, 2012 at 10:46 PM, Vladimir Makarov >>  wrote: >>> >>> On 04/24/2012 11:56 PM, Bin.Cheng wrote: >>>> >>>>

Is it possible to make gcc detect whether printf prints floating point numbers?

2012-06-08 Thread Bin.Cheng
Hi all, In micro-controller applications, code size is critical, and the size problem is worse if a library is linked. For example, most C programs call printf to format output data, which means floating point code gets linked even if the program only wants to output non-floating point numbers. Currently, w

builtin_strncat/builtin_strcat reads memory pointed to by the first argument?

2012-06-11 Thread Bin.Cheng
Hi, In "ref_maybe_used_by_call_p_1", the comment says "strcat/strncat additionally reads memory pointed to by the first argument." I do not understand these words well, why the first string is read by the two functions? Thanks for help. -- Best Regards.
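A minimal sketch of why a strcat-style routine has to read through its first argument: before it can append anything, it must walk the destination to find the terminating NUL. The function name `sketch_strcat` is hypothetical and written only for illustration; it is not the GCC builtin or any libc implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration only; not taken from GCC or any C library. */
char *sketch_strcat(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    char *p = dst;

    /* Read the destination until its terminating NUL is found.
       This is the "read of memory pointed to by the first argument". */
    while (*p != '\0')
        p++;

    /* Copy the source, including its terminating NUL, to that position. */
    while ((*p++ = *src++) != '\0')
        ;

    return dst;
}
```

The first loop only loads from `dst`, so an alias analysis that treated the call as write-only on the first argument could wrongly drop or reorder an earlier store to that buffer.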

Re: Is it possible to make gcc detect whether printf prints floating point numbers?

2012-06-11 Thread Bin.Cheng
On Tue, Jun 12, 2012 at 4:23 AM, Joseph S. Myers wrote: > On Fri, 8 Jun 2012, Bin.Cheng wrote: > >> For example, most c programs call printf to format output data, that >> means floating point code get linked even the program only want to >> output non-floating point num

regrename introduces dependencies and causes cprop_hardreg regressions

2012-07-11 Thread Bin.Cheng
Hi, I measured the impact of regrename on code size of benchmark CSiBE, using following command line: 1. arm-none-eabi-gcc -mthumb -mcpu=cortex-m0 -Os comparing to 2. arm-none-eabi-gcc -mthumb -mcpu=cortex-m0 -Os -frename-registers And I was surprised that regrename causes many code size regressi

RFC: extend cprop_hardreg into a global pass

2012-07-24 Thread Bin.Cheng
Hi, Currently GCC does hard register forward propagation in pass_cprop_hardreg to remove as many dependencies as possible and delete redundant copies, but this pass is limited to a single basic block, so it cannot do any global propagation. In fact, GCC does generate redundant copies/loads crossing

Re: RFC: extend cprop_hardreg into a global pass

2012-07-24 Thread Bin.Cheng
On Wed, Jul 25, 2012 at 2:14 AM, Steven Bosscher wrote: > Bin Cheng wrote: >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025 > > You could foster-parent and fix the attached patch to address this issue. > (I'm not interested in pursuing this further myself.) Thanks for your comments. I haven't

Re: RFC: extend cprop_hardreg into a global pass

2012-07-31 Thread Bin.Cheng
> > For the other PR you mentioned, that looks like a register allocation > regression, that should be addresses in IRA rather than in regcprop. not sure whether this is a RA regression. Though the two pseudo-regs are connected by reg-move insn and contains same value afterward, the two live range

Re: RFC: extend cprop_hardreg into a global pass

2012-07-31 Thread Bin.Cheng
>> Though the two pseudo-regs >> are connected by reg-move insn and contains same value afterward, >> the two live ranges(i.e. allocnos) are conflict with each other, thus IRA >> cannot allocate same hard register for them. > > If two allocnos have the same value, why can't IRA coalesce them? I don

Question on calculating register pressure before ira pass

2012-08-20 Thread Bin.Cheng
Hi, Currently I am working on improving hoist pass by calculating register pressure and using the info to guide hoist process. It works well and I will send a patch once I finish it. In this work I reused codes in loop-invariant and called ira_set_pseudo_classes function to calculate register pres

Re: Question on calculating register pressure before ira pass

2012-08-24 Thread Bin.Cheng
On Mon, Aug 20, 2012 at 5:00 PM, Bin.Cheng wrote: > Hi, > Currently I am working on improving hoist pass by calculating register > pressure and using the info to guide hoist process. It works well and > I will send a patch once I finish it. > > In this work I reused codes in

Hello, What's the status of live range shrink in GCC

2012-09-10 Thread Bin.Cheng
Hi, I dug into the gcc mail archive and found several threads discussing live range shrink, like: http://gcc.gnu.org/ml/gcc/2009-04/msg00248.html and http://gcc.gnu.org/ml/gcc-patches/2009-01/msg00188.html In these messages many people showed interest in LRS, in or out of sched1 pa

Re: Hello, What's the status of live range shrink in GCC

2012-09-10 Thread Bin.Cheng
On Tue, Sep 11, 2012 at 10:41 AM, Vladimir Makarov wrote: > On 12-09-10 6:05 AM, Bin.Cheng wrote: >> >> Hi, >> I digged into gcc mail archive and found there are several threads >> discussing about live range shrink, like: > > As I know Ghassan preferred to wo

Re: Question on documentation about RTL PRE in gccint

2012-10-22 Thread Bin.Cheng
On Mon, Oct 22, 2012 at 6:14 PM, Steven Bosscher wrote: > Bin. Cheng wrote: >> Quoting from GCCINT, section "9.5 RTL passes": >> "When optimizing for size, GCSE is done using Morel-Renvoise Partial >> Redundancy Elimination, with the exception that it does not try to >> move invariants out of loop

Re: Question on updating DF info during code hoisting

2012-10-22 Thread Bin.Cheng
On Mon, Oct 22, 2012 at 6:25 PM, Steven Bosscher wrote: > Bin.Cheng wrote: >> It is possible to have register pressure decreased when hoisting an >> expression up in flow graph because of shrunk live range of input >> register operands. >> To accurately simulating the

Question on load motion in GCSE

2012-10-30 Thread Bin.Cheng
Hi, Load motion in GCSE depends on the different simple memory references in pre_ldst_table not clobbering (aliasing) each other. I am assuming function simple_mem is the answer to this question, but what I don't understand is how simple_mem can ensure this. Did I understand the load mo

Re: Question on load motion in GCSE

2012-10-30 Thread Bin.Cheng
On Tue, Oct 30, 2012 at 9:47 PM, Bin.Cheng wrote: > Hi, > When doing load motion in GCSE, it depends on different simple memory > refers in pre_ldst_table won't clobber(alias to) each other. > I am assuming function simple_mem is the answer to this question, but > what I don

Question on find_def_preds in tree-ssa-uninit.c

2012-11-14 Thread Bin.Cheng
Hi, In function find_def_preds from tree-ssa-uninit.c there is following code: prev_nc = num_chains; compute_control_dep_chain (cd_root, opnd_edge->src, dep_chains, &num_chains, &cur_chain); /* Free individual chai

Questions on verification function df_lr_verify_solution_start/end

2012-12-09 Thread Bin.Cheng
Hi, When calculating the DF_LR information, GCC uses df_lr_verify_solution_start/end to verify the results. The two functions are called in df_analyze_problem when ENABLE_DF_CHECKING is defined, with each before and after the analysis. What I don't understand are: 1. In df_lr_verify_solution_start,

Too conservative in hardreg propagation pass?

2013-01-07 Thread Bin.Cheng
Hi, In cprop_hardreg pass, function find_oldest_value_reg checks if oldest_regno is in same register class as original one by calling in_hard_reg_set_p. Won't this be too conservative, considering the rewriting of rtx is guarded by validate_change? For example on Thumb1, r12 <-- r0 r1 <-- r1 + r12

Missed ssa-copyrename optimization?

2013-01-07 Thread Bin.Cheng
Hi, For attached preprocessed file, dump file lib_a-s_frexp.E.021t.copyrename1 contains gimple sequences like: : x_41 = x_8(D); goto ; : if (ix_15 <= 1048575) goto ; else goto ; : x_19 = x_8(D) * 1.8014398509481984e+16; gh_u.value = x_19; _21 = gh_u.parts.msw; hx_22

Re: Missed ssa-copyrename optimization?

2013-01-08 Thread Bin.Cheng
On Tue, Jan 8, 2013 at 6:42 PM, Richard Biener wrote: > On Tue, Jan 8, 2013 at 8:51 AM, Bin.Cheng wrote: >> Hi, >> For attached preprocessed file, dump file >> lib_a-s_frexp.E.021t.copyrename1 contains gimple sequences like: >> >> : >> x_41 = x_8(D

Identical basic blocks live long in RTL flow.

2013-01-16 Thread Bin.Cheng
Hi, For below simple function from newlib: static int is_option (char *argv_element, int only) { return ((argv_element == 0) || (argv_element[0] == '-') || (only && argv_element[0] == '+')); } The expanded rtl is like: 9: NOTE_INSN_BASIC_BLOCK 2 2: r113:SI=r0:SI 3: r114:SI=r1

Re: Identical basic blocks live long in RTL flow.

2013-01-17 Thread Bin.Cheng
On Thu, Jan 17, 2013 at 1:29 AM, Jan Hubicka wrote: >> > >> >Basic blocks 8/9/10 are identical and live until pass jump2, which is >> >after register allocation. >> >I think these duplicated BBs do not contain additional information and >> >should be better to be removed ASAP, because they might i

make check stops after one case in asan.exp

2013-01-19 Thread Bin.Cheng
Hi, I ran into a problem that gcc make check stops after one case in asan.exp, here is the dump information: Executed ./sleep-before-dying-1.exe, status 1 = ==30519== ERROR: AddressSanitizer: heap-use-after-free on address 0x41869fc5

Re: make check stops after one case in asan.exp

2013-01-19 Thread Bin.Cheng
On Sat, Jan 19, 2013 at 5:52 PM, Andreas Schwab wrote: > "Bin.Cheng" writes: > >> ERROR: (DejaGnu) proc "lreverse {{ASAN_OPTIONS 0}}" does not exist. > > What is the version of tcl you are using? Perhaps it doesn't know about > lreverse yet (which

Question on lower-subreg.c

2013-01-24 Thread Bin.Cheng
Hi, I read the code in lower-subreg.c and found that GCC only splits some multi-word mode instructions, like a load from memory into a pseudo reg, etc. The related code is in find_decomposable_subregs. So for below example from PR56102: double g = 1.0; double func(int a, double d) { if (a > 0) retur

Re: Question on lower-subreg.c

2013-01-25 Thread Bin.Cheng
On Fri, Jan 25, 2013 at 3:57 PM, Bin.Cheng wrote: > Hi, > I read code in lower-subreg.c and found GCC only split some of > multi-word mode instructions, like load from memory into pseudo reg, > etc. The related code is in find_decomposable_subregs. > > So for below ex

Inconsistency between code and document on nop_expr

2013-02-21 Thread Bin.Cheng
Hi, GCCINT says that nop_expr is used to represent conversions that do not require any code generation, while function tree_strip_nop_conversions calls tree_nop_conversion, which returns false even for NOP_EXPR node like "(unsigned int)a", where a has type int. Did I miss something? Thanks -- Bes
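As a small illustration of the conversion mentioned in the question, the sketch below casts an int to unsigned int and checks that the two objects hold the same bytes, which is why such a cast typically needs no generated code. The helper names are hypothetical, the byte-identity check assumes a two's-complement target, and nothing here models GCC's internal tree_nop_conversion logic.

```c
#include <assert.h>
#include <string.h>

/* The conversion from the question: (unsigned int)a where a has type int. */
unsigned int as_unsigned(int a)
{
    return (unsigned int)a;
}

/* Returns 1 when the int and its unsigned conversion share a bit pattern.
   Assumes a two's-complement representation of int. */
int same_bits(int a)
{
    unsigned int u = as_unsigned(a);

    assert(sizeof u == sizeof a);
    return memcmp(&u, &a, sizeof a) == 0;
}
```

Whether a particular GCC helper classifies this cast as a no-op is a separate question from whether it generates code, since some helpers also require the signedness of the two types to match.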

Re: Inconsistency between code and document on nop_expr

2013-02-21 Thread Bin.Cheng
On Fri, Feb 22, 2013 at 12:14 PM, Andrew Pinski wrote: > On Thu, Feb 21, 2013 at 7:16 PM, Bin.Cheng wrote: >> Hi, >> GCCINT says that nop_expr is used to represent conversions that do not >> require any code generation, while function tree_strip_nop_conversions >>

Re: Inconsistency between code and document on nop_expr

2013-02-21 Thread Bin.Cheng
On Fri, Feb 22, 2013 at 12:33 PM, Andrew Pinski wrote: > On Thu, Feb 21, 2013 at 8:31 PM, Bin.Cheng wrote: >> On Fri, Feb 22, 2013 at 12:14 PM, Andrew Pinski wrote: >>> On Thu, Feb 21, 2013 at 7:16 PM, Bin.Cheng wrote: >>>> Hi, >>>> GCCINT says that

Question on multiplied address cost computation in ivopt

2013-02-22 Thread Bin.Cheng
Hi, Function get_address_cost in ivopt computes multiplied address cost with below code: First: rat = 1; for (i = 2; i <= MAX_RATIO; i++) if (multiplier_allowed_in_address_p (i, mem_mode, as)) { rat = i; break; } Then: if (rat_p) addr = ge
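The "First" step quoted above can be sketched in isolation as a search for the smallest multiplier the target accepts in an address, falling back to 1 when none is allowed. This is only a sketch under stated assumptions: `first_allowed_ratio`, `SKETCH_MAX_RATIO`, and the two predicate functions are hypothetical stand-ins, and the real multiplier_allowed_in_address_p also takes a memory mode and an address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed upper bound for the search; stands in for GCC's MAX_RATIO. */
#define SKETCH_MAX_RATIO 128

/* Hypothetical stand-in for multiplier_allowed_in_address_p. */
typedef bool (*allowed_fn)(int ratio);

/* Returns the first ratio in [2, SKETCH_MAX_RATIO] accepted by the
   predicate, or 1 if the target accepts no scaled index at all. */
int first_allowed_ratio(allowed_fn allowed)
{
    assert(allowed != NULL);

    int rat = 1;
    for (int i = 2; i <= SKETCH_MAX_RATIO; i++)
        if (allowed(i)) {
            rat = i;
            break;
        }
    return rat;
}

/* Hypothetical target that supports index scales of 2, 4 and 8. */
bool scale_248(int ratio)
{
    return ratio == 2 || ratio == 4 || ratio == 8;
}

/* Hypothetical target with no scaled addressing. */
bool no_scale(int ratio)
{
    (void)ratio;
    return false;
}
```

Because the loop stops at the first hit, a target that allows 2, 4 and 8 is probed with ratio 2 only, so the cost measured for a multiplied address reflects that one scale.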

Re: Question on multiplied address cost computation in ivopt

2013-02-25 Thread Bin.Cheng
On Mon, Feb 25, 2013 at 5:39 PM, Richard Biener wrote: > On Fri, Feb 22, 2013 at 9:42 AM, Bin.Cheng wrote: >> Hi, >> Function get_address_cost in ivopt computes multiplied address cost >> with below code: >> >> First: >> rat = 1; >>

Re: Question on multiplied address cost computation in ivopt

2013-02-25 Thread Bin.Cheng
On Mon, Feb 25, 2013 at 8:51 PM, Richard Biener wrote: > On Mon, Feb 25, 2013 at 11:13 AM, Bin.Cheng wrote: >> On Mon, Feb 25, 2013 at 5:39 PM, Richard Biener >> >> Another question about multiplied address is in function >> multiplier_allowed_in_address_p, it cons

Question about -moutline-atomic under -mcmodel-large

2020-09-17 Thread Bin.Cheng via Gcc
Hi, Compiling below program: #define STREAM_ARRAY_SIZE (1107296256) double a[STREAM_ARRAY_SIZE], b[STREAM_ARRAY_SIZE], c[STREAM_ARRAY_SIZE]; typedef struct { volatile int locked; } spinlock_t; volatile int cnt32=0; volatile long cnt64=0; void atom(){ __atomic_fetch_add(&cnt32,

Re: cache optimization through samping hardware event

2020-11-18 Thread Bin.Cheng via Gcc
On Tue, Nov 10, 2020 at 3:04 PM 172060045 <172060...@hdu.edu.cn> wrote: > > Hi, > > Recently, I was interested in GCC AutoFDO optimization, which works by > sampling specific PMU event on production machines and using those profiles > to guide optimization. In this way, information such as cache

Re: State of AutoFDO in GCC

2021-04-22 Thread Bin.Cheng via Gcc
On Fri, Apr 23, 2021 at 4:16 AM Martin Liška wrote: > > On 4/22/21 9:58 PM, Eugene Rozenfeld via Gcc wrote: > > GCC documentation for AutoFDO points to create_gcov tool that converts > > perf.data file into gcov format that can be consumed by gcc with > > -fauto-profile (https://gcc.gnu.org/onli

Question about builtin_free doesn't read memory

2021-11-27 Thread Bin.Cheng via Gcc
Hi, In function ref_maybe_used_by_call_p_1, there is below code snippet /* The following builtins do not read from memory. */ case BUILT_IN_FREE: ... return false; I am confused because free function does read from (and even write to) memory pointed to by pas

Re: Question about builtin_free doesn't read memory

2021-11-28 Thread Bin.Cheng via Gcc
On Sun, Nov 28, 2021 at 4:11 PM Jan Hubicka wrote: > > > Hi, > > In function ref_maybe_used_by_call_p_1, there is below code snippet > > /* The following builtins do not read from memory. */ > > case BUILT_IN_FREE: > > ... > >return false; > > > > I am confu
