Re: ipa vrp implementation in gcc

2016-02-10 Thread Bin.Cheng
On Mon, Jan 18, 2016 at 5:10 PM, Jan Hubicka  wrote:
>> On Mon, Jan 18, 2016 at 12:00 AM, Kugan
>>  wrote:
>> > Hi,
>> >
>> >> Another potential use of value ranges is the profile estimation.
>> >> http://www.lighterra.com/papers/valuerangeprop/Patterson1995-ValueRangeProp.pdf
>> >> It seems to me that we may want to have something that can feed sane loop
>> >> bounds for profile estimation as well and we can easily store the known
>> >> value ranges to SSA name annotations.
>> >> So I think separate local pass to compute value ranges (perhaps with less
>> >> accuracy than full blown VRP) is desirable.
>> >
>> > Thanks for the reference. I am looking at implementing a local pass for
>> > VRP. The value range computation in tree-vrp is based on the above
>> > reference and uses ASSERT_EXPR insertion (I understand that you posted
>> > the reference above for profile estimation). As Richard mentioned in his
>> > reply, the local pass should not rely on ASSERT_EXPR insertion.
>> > Therefore, do you have any specific algorithm in mind (i.e. any
>> > published paper or reference from a book)? Of course we can tweak the
>> > algorithm from the reference above but would like to understand what
>> > your intentions are.
>>
>> I have (very incomplete) prototype patches to do a dominator-based
>> approach instead (what is referred to downthread as the non-iterating
>> approach). That's cheaper and is what I'd like to provide as a
>> "utility style" interface to things like niter analysis which need
>> range-info based on a specific dominator (the loop header for example).
>
> In general, given that we have existing VRP implementation I would suggest
> first implementing the IPA propagation and profile estimation bits using
> existing VRP pass and then try to compare the simple dominator based approach
> with the VRP we have and see what are the compile time/code quality effects
> of both. Based on that we can decide how complex VRP we really want.
Hi Honza,
These two do not conflict with each other, right?  The control-flow
sensitive VRP in each function could well inherit IPA analysis
results.

Thanks,
bin
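
The idea that a per-function, control-flow sensitive VRP can inherit IPA
results can be sketched as follows. This is only an illustrative model: the
`range` struct and the helpers `range_hull` and `range_intersect` are
hypothetical names, not GCC's value_range API. The assumption is that IPA
propagation merges the ranges seen at all call sites of a parameter, and the
local pass then narrows that inherited range with what it learns from
conditions inside the function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical closed integer interval [lo, hi]; not GCC's value_range.  */
struct range { long lo; long hi; };

/* Merge the ranges of one argument over two call sites, as an IPA
   propagation step might do: the result must cover both inputs.  */
static struct range range_hull(struct range a, struct range b)
{
    struct range r;
    assert(a.lo <= a.hi && b.lo <= b.hi);
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* Narrow an inherited range with a locally derived one (for example from
   a dominating condition).  Returns false when the intersection is empty,
   which would mean the guarded code is unreachable.  */
static bool range_intersect(struct range a, struct range b, struct range *out)
{
    long lo, hi;
    assert(a.lo <= a.hi && b.lo <= b.hi);
    lo = a.lo > b.lo ? a.lo : b.lo;
    hi = a.hi < b.hi ? a.hi : b.hi;
    if (lo > hi)
        return false;
    out->lo = lo;
    out->hi = hi;
    return true;
}
```

For instance, if the call sites pass values in [0, 10] and [5, 20], the
parameter starts the function with [0, 20]; a dominating check `x >= 15`
then narrows it to [15, 20] without either analysis getting in the way of
the other.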
>
> It will be probably also more fun to implement it this way :)
> I plan to collect some data on early VRP and firefox today or tomorrow.
>
> Honza
>>
>> Richard.
>>
>> >> I think the ipa-prop.c probably won't need any significant changes.  The
>> >> code basically analyzes what values are passed through the function and
>> >> this works for constants as well as for intervals. In fact ipa-cp already
>> >> uses the same ipa-prop analysis for
>> >>  1) constant propagation
>> >>  2) alignment propagation
>> >>  3) propagation of known polymorphic call contexts.
>> >>
>> >> So replacing 1) by value range propagation should be easily doable.
>> >> I would also like to replace alignment propagation by bitwise constant
>> >> propagation (i.e. propagating what bits are known to be zero and what
>> >> bits are known to be one). We already do have bitwise CCP, so we could
>> >> deal with this basically in the same way as we deal with value ranges.
>> >>
>> >> ipa-prop could use a bit of cleaning up and modularizing that I hope will
>> >> be done next stage1 :)
>> >
>> > We (Myself and Prathamesh) are interested in working on LTO
>> > improvements. Let us have a look at this.
>> >
>> >>>
>>  - Once we have the value ranges for parameter/return values, we could
>>  rely on tree-vrp to use this and do the optimizations
>> >>>
>> >>> Yep.  IPA transform phase should annotate parameter default defs with
>> >>> computed ranges.
>> >>
>> >> Yep, in addition we will end up with known value ranges stored in
>> >> aggregates; for that we need a better separate representation.
>> >>
>> >> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68930
>> >>>
>> >
>> > Thanks,
>> > Kugan
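
The proposal quoted above, to replace alignment propagation by bitwise
constant propagation, can be illustrated with a small value/mask lattice.
This is a sketch under assumptions, not the ipa-cp implementation: the
`known_bits` struct and the helper names are hypothetical, and the
convention that a set mask bit means "unknown" is simply the one chosen
here. Merging the values seen at different call sites keeps a bit known
only if it is known and equal on both sides; alignment then falls out as
the number of low bits known to be zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lattice element: a mask bit of 1 means "unknown"; where the
   mask bit is 0, the corresponding bit of value is the known bit.  */
struct known_bits { uint64_t value; uint64_t mask; };

/* A compile-time constant has every bit known.  */
static struct known_bits bits_from_constant(uint64_t c)
{
    struct known_bits k;
    k.value = c;
    k.mask = 0;
    return k;
}

/* Meet of two call sites: a bit stays known only if both sides know it
   and agree on its value.  */
static struct known_bits bits_meet(struct known_bits a, struct known_bits b)
{
    struct known_bits r;
    r.mask = a.mask | b.mask | (a.value ^ b.value);
    r.value = a.value & ~r.mask;
    return r;
}

/* Number of low bits known to be zero; 2^n is then a known alignment.  */
static unsigned known_trailing_zeros(struct known_bits k)
{
    unsigned n = 0;
    while (n < 64 && ((k.mask >> n) & 1) == 0 && ((k.value >> n) & 1) == 0)
        n++;
    return n;
}
```

With arguments 16 and 48 at two call sites, bit 5 becomes unknown but the
four low bits stay known zero, so the callee can still assume 16-byte
alignment, which is the information alignment propagation tracks today.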


Re: Inconsistent initialization for pic_offset_table_rtx?

2016-02-10 Thread Ilya Enkovich
2016-02-09 19:24 GMT+03:00 Jeff Law :
> On 02/09/2016 07:27 AM, Bin.Cheng wrote:
>>
>> On Fri, Feb 5, 2016 at 10:32 AM, Ilya Enkovich 
>> wrote:
>>>
>>> 2016-02-04 19:16 GMT+03:00 Bin.Cheng :

 On Thu, Feb 4, 2016 at 3:18 PM, Ilya Enkovich 
 wrote:
>
> 2016-02-04 17:12 GMT+03:00 Bin.Cheng :
>>
>> Hi,
>> I noticed that pic_offset_table_rtx is initialized twice in GCC.  Take
>> x86_32 as an example.
>> The first initialization is done in init_emit_regs, with the code below:
>>
>>   pic_offset_table_rtx = NULL_RTX;
>>   if ((unsigned) PIC_OFFSET_TABLE_REGNUM != INVALID_REGNUM)
>>     pic_offset_table_rtx = gen_raw_REG (Pmode, PIC_OFFSET_TABLE_REGNUM);
>>
>> On x86_32 with pic, we have:
>>
>> (gdb) call debug_rtx(this_target_rtl->x_pic_offset_table_rtx)
>> (reg:SI 3 bx)
>>
>> The second initialization is in expand_used_vars, with the code below:
>>
>>   if (targetm.use_pseudo_pic_reg ())
>>     pic_offset_table_rtx = gen_reg_rtx (Pmode);
>>
>> On x86_32 with pic, we have:
>>
>> (gdb) call debug_rtx(this_target_rtl->x_pic_offset_table_rtx)
>> (reg:SI 87)
>>
>> So basically after expanding the first function, pic_offset_table_rtx
>> is set to a pseudo register, rather than the one initialized in
>> init_emit_regs.
>>
>> Also this causes inconsistent compilation for the first/rest functions
>> in one compilation unit.
>>
>> A bug?
>
>
> For i386 target PIC_OFFSET_TABLE_REGNUM actually checks
> ix86_use_pseudo_pic_reg and is supposed to return INVALID_REGNUM
> in case we use pseudo register for PIC. BUT we hit a case when PIC
> code is generated for cost estimation via target hooks while performing
> some GIMPLE pass. In this case we need to return some register to

 Thanks Ilya.  This is exactly the case I ran into.  See PR69042.

> generate PIC usage but we don't have any allocated. In this case we
> return a hard register. We detect such situation by checking
> pic_offset_table_rtx.
>
> Thus if we use pseudo PIC register but pic_offset_table_rtx is not
> initialized yet,
> then PIC_OFFSET_TABLE_REGNUM returns a hard register.
>
> So I suppose we may consider the first assignment as a bug.


 But I don't quite follow.  So hard register is returned so that gimple
 passes can construct PIC related addresses?  If this is the case, the
 first initialization is necessary.
>>>
>>>
>>> Right, we need some initialization but current way leads to inconsistent
>>> value and we may 'randomly' get hard or pseudo register for different
>>> functions
>>> which possibly affects some optimizations. We probably need a better
>>> way to check if we should return a hard reg for PIC register. Or maybe
>>> reset ix86_use_pseudo_pic_reg to NULL when function is finalized or
>>> get rid of hard reg at all and always use a pseudo register.
>>>
 Another question is about address cost:

   if (parts.index
       && (!REG_P (parts.index)
           || REGNO (parts.index) >= FIRST_PSEUDO_REGISTER)
       && (current_pass->type == GIMPLE_PASS
           || !pic_offset_table_rtx
           || !REG_P (parts.index)
           || REGNO (pic_offset_table_rtx) != REGNO (parts.index)))
     cost++;
 Is it a bug in the second sub condition?  Considering
 "current_pass->type == GIMPLE_PASS" in the third sub condition, can I
 assume the second is for non-GIMPLE passes only?
>>>
>>>
>>> The code is simply duplicated for the parts.base and parts.index
>>> registers. We use the same conditions for both of them because they
>>> can easily be swapped if the scale is 1.
>>>
>>> I don't know why we don't try to recognize PIC registers when in GIMPLE
>>> pass.
>>
>>
>> Could we define PIC_OFFSET_TABLE_REGNUM to a pseudo register when
>> ix86_use_pseudo_pic_reg returns true, as in this case?  Current logic
>> doesn't look consistent to me.  Of course, I know little about x86.
>
> I think defining it as a hard reg was a stopgap.  I don't think we've got
> the level of consistency we want for PIC register as a pseudo throughout the
> generic code or the x86 backend.
>
> My recollection is that it doesn't appear in the IL as a hard register
> anymore.

Unfortunately we still use it as EBX.  It happens when we try to
compute address cost and call target hook to legitimize address for
that.  It involves pic_offset_table_rtx usage and would ICE if we
don't initialize it.  Since pic_offset_table_rtx initialization in
init_emit_regs happens only if PIC_OFFSET_TABLE_REGNUM holds some hard
reg, we use EBX_REG value as workaround.

When some function is expanded pic_offset_table_rtx gets a pseudo
register which is used for address legitimization during next function
optimization.  Thus we have a kind of inconsistency using either hard
or pseudo register for address cost estimation depending on fu
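
The inconsistency discussed in this thread can be reproduced in a small
stand-alone model. Everything below is a sketch under assumptions: the
`pic_model` struct, the `model_*` functions and the `MODEL_*` constants are
hypothetical stand-ins for pic_offset_table_rtx, PIC_OFFSET_TABLE_REGNUM,
init_emit_regs and expand_used_vars, and only the pseudo-PIC-register case
described above is modelled. The register numbers 3 and 87 are taken from
the gdb output quoted earlier; the hard/pseudo boundary of 80 is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NO_REG                (-1) /* stands in for NULL_RTX */
#define MODEL_INVALID_REGNUM        (-2) /* stands in for INVALID_REGNUM */
#define MODEL_HARD_PIC_REG          3    /* bx, per "(reg:SI 3 bx)" above */
#define MODEL_FIRST_PSEUDO_REGISTER 80   /* arbitrary boundary for the model */

struct pic_model {
    int pic_reg;     /* stands in for pic_offset_table_rtx */
    int next_pseudo; /* next pseudo register number to hand out */
};

/* Model of PIC_OFFSET_TABLE_REGNUM when a pseudo PIC register is used:
   a hard register is reported only while no PIC register is set yet.  */
static int model_pic_offset_table_regnum(const struct pic_model *m)
{
    return m->pic_reg == MODEL_NO_REG ? MODEL_HARD_PIC_REG
                                      : MODEL_INVALID_REGNUM;
}

/* Model of the first initialization (once per compilation).  */
static void model_init_emit_regs(struct pic_model *m)
{
    int regnum;
    m->pic_reg = MODEL_NO_REG;
    regnum = model_pic_offset_table_regnum(m);
    if (regnum != MODEL_INVALID_REGNUM)
        m->pic_reg = regnum;
}

/* Model of the second initialization (once per expanded function).
   Nothing resets pic_reg afterwards, so the pseudo leaks into the
   GIMPLE passes of the next function.  */
static void model_expand_function(struct pic_model *m)
{
    m->pic_reg = m->next_pseudo++;
}

static bool model_is_hard_reg(int regno)
{
    return regno >= 0 && regno < MODEL_FIRST_PSEUDO_REGISTER;
}
```

In this model, cost estimation for the first function sees hard register 3,
while cost estimation for every later function sees whichever pseudo the
previous expansion left behind, which is the first/rest asymmetry reported
at the start of the thread.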

RE: [Patch] MIPS FDE deletion

2016-02-10 Thread Moore, Catherine


> -----Original Message-----
> From: Maciej W. Rozycki [mailto:ma...@imgtec.com]
> Sent: Tuesday, January 19, 2016 10:28 AM
> To: Moore, Catherine
> Cc: binut...@sourceware.org; gcc@gcc.gnu.org; Richard Sandiford
> Subject: RE: [Patch] MIPS FDE deletion
> 
> On Mon, 11 Jan 2016, Moore, Catherine wrote:
> 
> > >  Does it mean PR target/53276 has been fixed now?  What was the
> > > commit to add .cfi support for the stubs?
> >
> > I don't know about the status of PR target/53276.  The commit to add
> > .cfi support for call stubs was this one:
> >
> > r184379 | rsandifo | 2012-02-19 08:44:54 -0800 (Sun, 19 Feb 2012) | 7
> > lines
> >
> > gcc/
> > * config/mips/mips.c (mips16_build_call_stub): Add CFI information
> > to stubs with non-sibling calls.
> >
> > libgcc/
> > * config/mips/mips16.S (CALL_STUB_RET): Add CFI information.
> 
>  Thanks.  I thought it was something recent, but this is fairly old.
> 
>  I saw your patch handles the `fn_stub' case among others and your test case
> included an `__fn_stub_foo' stub too, which is what PR target/53276 is all
> about, which is why I thought it may have been resolved and the existence
> of the PR accidentally missed.
> 
>  BTW, your test case has a stub of the `fn_stub' kind (`__fn_stub_foo') and
> one of the `call_fp_stub' kind (`__call_stub_fp_foo'), but none of the
> `call_stub' kind (for `foo' it would be called `__call_stub_foo').  The
> latter has AFAICT been addressed by r184379.  Was the omission from the
> test case then deliberate for some reason (why?) or just accidental?
> 

This is a follow-on patch to fix failures in the GDB MIPS16 thunk tests.  I've
now augmented the test case to handle the "__call_stub_foo" case.
Does this look OK to commit?
Thanks,
Catherine



fde.cl
Description: fde.cl


fde.patch
Description: fde.patch


gcc-4.9-20160210 is now available

2016-02-10 Thread gccadmin
Snapshot gcc-4.9-20160210 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160210/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 233309

You'll find:

 gcc-4.9-20160210.tar.bz2 Complete GCC

  MD5=c1e70f600ee67864af2dbd30df78be83
  SHA1=ed8d0deeaa3baa52e6e8c5ad6c5819a48e1fd092

Diffs from 4.9-20160203 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.