Re: ipa vrp implementation in gcc
On Mon, Jan 18, 2016 at 5:10 PM, Jan Hubicka wrote:
>> On Mon, Jan 18, 2016 at 12:00 AM, Kugan wrote:
>> > Hi,
>> >
>> >> Another potential use of value ranges is the profile estimation.
>> >> http://www.lighterra.com/papers/valuerangeprop/Patterson1995-ValueRangeProp.pdf
>> >> It seems to me that we may want to have something that can feed sane loop
>> >> bounds for profile estimation as well and we can easily store the known
>> >> value ranges to SSA name annotations.
>> >> So I think separate local pass to compute value ranges (perhaps with less
>> >> accuracy than full blown VRP) is desirable.
>> >
>> > Thanks for the reference. I am looking at implementing a local pass for
>> > VRP. The value range computation in tree-vrp is based on the above
>> > reference and uses ASSERT_EXPR insertion (I understand that you posted
>> > the reference above for profile estimation). As Richard mentioned in his
>> > reply, the local pass should not rely on ASSERT_EXPR insertion.
>> > Therefore, do you have any specific algorithm in mind (i.e. any
>> > published paper or reference from a book)? Of course we can tweak the
>> > algorithm from the reference above but would like to understand what
>> > your intentions are.
>>
>> I have (very incomplete) prototype patches to do a dominator-based
>> approach instead (what is referred to downthread as the non-iterating
>> approach). That's cheaper and is what I'd like to provide as a
>> "utility style" interface to things like niter analysis which need
>> range-info based on a specific dominator (the loop header for example).
>
> In general, given that we have existing VRP implementation I would suggest
> first implementing the IPA propagation and profile estimation bits using
> existing VRP pass and then try to compare the simple dominator based approach
> with the VRP we have and see what are the compile time/code quality effects
> of both. Based on that we can decide how complex VRP we really want.

Hi Honza,
These two do not conflict with each other, right? The control-flow
sensitive VRP in each function could well inherit IPA analysis results.

Thanks,
bin
>
> It will be probably also more fun to implement it this way :)
> I plan to collect some data on early VRP and firefox today or tomorrow.
>
> Honza
>>
>> Richard.
>>
>> >> I think the ipa-prop.c probably won't need any significant changes. The
>> >> code basically analyzes what values are passed through the function and
>> >> this works for constants as well as for intervals. In fact ipa-cp already
>> >> uses the same ipa-prop analysis for
>> >> 1) constant propagation
>> >> 2) alignment propagation
>> >> 3) propagation of known polymorphic call contexts.
>> >>
>> >> So replacing 1) by value range propagation should be easily doable.
>> >> I would also like to replace alignment propagation by bitwise constant
>> >> propagation (i.e. propagating what bits are known to be zero and what
>> >> bits are known to be one). We already do have bitwise CCP, so we could
>> >> deal with this basically in the same way as we deal with value ranges.
>> >>
>> >> ipa-prop could use a bit of cleaning up and modularizing that I hope will
>> >> be done next stage1 :)
>> >
>> > We (Myself and Prathamesh) are interested in working on LTO
>> > improvements. Let us have a look at this.
>> >
>> >>> >> - Once we have the value ranges for parameter/return values, we could
>> >>> >> rely on tree-vrp to use this and do the optimizations
>> >>>
>> >>> Yep. IPA transform phase should annotate parameter default defs with
>> >>> computed ranges.
>> >>
>> >> Yep, in addition we will end up with known value ranges stored in
>> >> aggregates; for that we need a better separate representation.
>> >>
>> >> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68930
>> >>>
>> >
>> > Thanks,
>> > Kugan
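
To make the interval idea in this thread concrete, here is a minimal sketch in C of how a parameter's range could be derived from the argument ranges seen at its call sites. It is not GCC code: `struct range`, `range_make`, `range_union` and `param_range` are hypothetical names chosen for illustration, and the sketch assumes simple closed intervals of `long` with no overflow handling or anti-ranges.

```c
#include <stdbool.h>

/* Hypothetical closed interval [lo, hi]; this is NOT GCC's value-range
   representation.  `undefined` marks "no call site seen yet".  */
struct range
{
  long lo;
  long hi;
  bool undefined;
};

/* Build a known interval [lo, hi].  */
struct range
range_make (long lo, long hi)
{
  struct range r = { lo, hi, false };
  return r;
}

/* Smallest interval that covers both A and B.  An undefined operand
   contributes nothing to the result.  */
struct range
range_union (struct range a, struct range b)
{
  if (a.undefined)
    return b;
  if (b.undefined)
    return a;
  return range_make (a.lo < b.lo ? a.lo : b.lo,
                     a.hi > b.hi ? a.hi : b.hi);
}

/* Range of a formal parameter, given the range of the actual argument
   at each of the N call sites (assumed to be all callers).  */
struct range
param_range (const struct range *args, int n)
{
  struct range r = { 0, 0, true };
  for (int i = 0; i < n; i++)
    r = range_union (r, args[i]);
  return r;
}
```

With call sites passing [0, 10] and [5, 20], `param_range` yields [0, 20]; a per-function pass could then start from that interval for the parameter instead of assuming the full range of its type.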
Re: Inconsistent initialization for pic_offset_table_rtx?
2016-02-09 19:24 GMT+03:00 Jeff Law:
> On 02/09/2016 07:27 AM, Bin.Cheng wrote:
>>
>> On Fri, Feb 5, 2016 at 10:32 AM, Ilya Enkovich wrote:
>>>
>>> 2016-02-04 19:16 GMT+03:00 Bin.Cheng:
>>>>
>>>> On Thu, Feb 4, 2016 at 3:18 PM, Ilya Enkovich wrote:
>>>>>
>>>>> 2016-02-04 17:12 GMT+03:00 Bin.Cheng:
>>>>>>
>>>>>> Hi,
>>>>>> I noticed that pic_offset_table_rtx is initialized twice in GCC. Take
>>>>>> x86_32 as an example.
>>>>>> The first initialization is done in init_emit_regs, with below code:
>>>>>>
>>>>>>   pic_offset_table_rtx = NULL_RTX;
>>>>>>   if ((unsigned) PIC_OFFSET_TABLE_REGNUM != INVALID_REGNUM)
>>>>>>     pic_offset_table_rtx = gen_raw_REG (Pmode, PIC_OFFSET_TABLE_REGNUM);
>>>>>>
>>>>>> On x86_32 with pic, we have:
>>>>>>
>>>>>> (gdb) call debug_rtx(this_target_rtl->x_pic_offset_table_rtx)
>>>>>> (reg:SI 3 bx)
>>>>>>
>>>>>> The second initialization is in expand_used_vars, with below code:
>>>>>>
>>>>>>   if (targetm.use_pseudo_pic_reg ())
>>>>>>     pic_offset_table_rtx = gen_reg_rtx (Pmode);
>>>>>>
>>>>>> On x86_32 with pic, we have:
>>>>>>
>>>>>> (gdb) call debug_rtx(this_target_rtl->x_pic_offset_table_rtx)
>>>>>> (reg:SI 87)
>>>>>>
>>>>>> So basically after expanding the first function, pic_offset_table_rtx
>>>>>> is set to a pseudo register, rather than the one initialized in
>>>>>> init_emit_regs.
>>>>>>
>>>>>> Also this causes inconsistent compilation for the first/rest functions
>>>>>> in one compilation unit.
>>>>>>
>>>>>> A bug?
>>>>>
>>>>>
>>>>> For i386 target PIC_OFFSET_TABLE_REGNUM actually checks
>>>>> ix86_use_pseudo_pic_reg and is supposed to return INVALID_REGNUM
>>>>> in case we use pseudo register for PIC. BUT we hit a case when PIC
>>>>> code is generated for cost estimation via target hooks while performing
>>>>> some GIMPLE pass. In this case we need to return some register to
>>>>
>>>> Thanks Ilya. This is exactly the case I ran into. See PR69042.
>>>>
>>>>> generate PIC usage but we don't have any allocated. In this case we
>>>>> return a hard register. We detect such situation by checking
>>>>> pic_offset_table_rtx.
>>>>>
>>>>> Thus if we use pseudo PIC register but pic_offset_table_rtx is not
>>>>> initialized yet,
>>>>> then PIC_OFFSET_TABLE_REGNUM returns a hard register.
>>>>>
>>>>> So I suppose we may consider the first assignment as a bug.
>>>>
>>>> But I don't quite follow. So hard register is returned so that gimple
>>>> passes can construct PIC related addresses? If this is the case, the
>>>> first initialization is necessary.
>>>
>>>
>>> Right, we need some initialization but current way leads to inconsistent
>>> value and we may 'randomly' get hard or pseudo register for different
>>> functions
>>> which possibly affects some optimizations. We probably need a better
>>> way to check if we should return a hard reg for PIC register. Or maybe
>>> reset ix86_use_pseudo_pic_reg to NULL when function is finalized or
>>> get rid of hard reg at all and always use a pseudo register.
>>>
>>>> Another question is about address cost:
>>>>
>>>>   if (parts.index
>>>>       && (!REG_P (parts.index)
>>>>           || REGNO (parts.index) >= FIRST_PSEUDO_REGISTER)
>>>>       && (current_pass->type == GIMPLE_PASS
>>>>           || !pic_offset_table_rtx
>>>>           || !REG_P (parts.index)
>>>>           || REGNO (pic_offset_table_rtx) != REGNO (parts.index)))
>>>>     cost++;
>>>>
>>>> Is it a bug in the second sub condition? Considering
>>>> "current_pass->type == GIMPLE_PASS" in the third sub condition, can I
>>>> assume the second is for non-GIMPLE passes only?
>>>
>>>
>>> There is just a code duplicated for parts.base and parts.index
>>> registers. We use same
>>> conditions for both of them because you can easily swap them if scale is
>>> 1.
>>>
>>> I don't know why we don't try to recognize PIC registers when in GIMPLE
>>> pass.
>>
>>
>> Could we define PIC_OFFSET_TABLE_REGNUM to a pseudo register when
>> ix86_use_pseudo_pic_reg returns true, as in this case? Current logic
>> doesn't look consistent to me. Of course, I know little about x86.
>
> I think defining it as a hard reg was a stopgap. I don't think we've got
> the level of consistency we want for PIC register as a pseudo throughout the
> generic code or the x86 backend.
>
> My recollection is that it doesn't appear in the IL as a hard register
> anymore.

Unfortunately we still use it as EBX. It happens when we try to compute
address cost and call target hook to legitimize address for that. It
involves pic_offset_table_rtx usage and would ICE if we don't initialize
it. Since pic_offset_table_rtx initialization in init_emit_regs happens
only if PIC_OFFSET_TABLE_REGNUM holds some hard reg, we use EBX_REG value
as workaround. When some function is expanded pic_offset_table_rtx gets
a pseudo register which is used for address legitimization during next
function optimization. Thus we have a kind of inconsistency using either
hard or pseudo register for address cost estimation depending on fu
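
The ordering problem discussed in this thread can be illustrated with a small toy model in C. This is only a sketch under stated assumptions, not GCC code: `pic_reg` stands in for pic_offset_table_rtx (with INVALID_REGNUM playing the role of NULL_RTX), `pic_offset_table_regnum`, `model_init_emit_regs` and `model_expand_used_vars` are hypothetical stand-ins for the macro and the two initialization sites, and the register numbers 3 and 87 are simply taken from the gdb output quoted above. The real i386 macro depends on more conditions than are modelled here.

```c
#include <stdbool.h>

#define INVALID_REGNUM (-1)  /* stand-in for GCC's INVALID_REGNUM */
#define HARD_PIC_REGNUM 3    /* EBX, "(reg:SI 3 bx)" in the gdb output */

/* Toy state (all hypothetical): pic_reg models pic_offset_table_rtx,
   next_pseudo models pseudo register allocation.  */
int pic_reg = INVALID_REGNUM;
int next_pseudo = 87;
bool use_pseudo_pic_reg = true;

/* Models PIC_OFFSET_TABLE_REGNUM as described in the thread: with a
   pseudo PIC register in use it yields INVALID_REGNUM, unless pic_reg
   is still unset, in which case it falls back to the hard register.  */
int
pic_offset_table_regnum (void)
{
  if (use_pseudo_pic_reg && pic_reg != INVALID_REGNUM)
    return INVALID_REGNUM;
  return HARD_PIC_REGNUM;
}

/* Models the first initialization (init_emit_regs).  */
void
model_init_emit_regs (void)
{
  pic_reg = INVALID_REGNUM;
  int regnum = pic_offset_table_regnum ();
  if (regnum != INVALID_REGNUM)
    pic_reg = regnum;
}

/* Models the second initialization (expand_used_vars).  */
void
model_expand_used_vars (void)
{
  if (use_pseudo_pic_reg)
    pic_reg = next_pseudo++;
}
```

Calling `model_init_emit_regs` and then `model_expand_used_vars` leaves `pic_reg` at the hard register while the first function is being optimized and at a pseudo afterwards, which is the first-versus-later-function difference described in the messages above.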
RE: [Patch] MIPS FDE deletion
> -----Original Message-----
> From: Maciej W. Rozycki [mailto:ma...@imgtec.com]
> Sent: Tuesday, January 19, 2016 10:28 AM
> To: Moore, Catherine
> Cc: binut...@sourceware.org; gcc@gcc.gnu.org; Richard Sandiford
> Subject: RE: [Patch] MIPS FDE deletion
>
> On Mon, 11 Jan 2016, Moore, Catherine wrote:
>
> > > Does it mean PR target/53276 has been fixed now? What was the
> > > commit to add .cfi support for the stubs?
> >
> > I don't know about the status of PR target/53276. The commit to add
> > .cfi support for call stubs was this one:
> >
> > r184379 | rsandifo | 2012-02-19 08:44:54 -0800 (Sun, 19 Feb 2012) | 7 lines
> >
> > gcc/
> > * config/mips/mips.c (mips16_build_call_stub): Add CFI information
> > to stubs with non-sibling calls.
> >
> > libgcc/
> > * config/mips/mips16.S (CALL_STUB_RET): Add CFI information.
>
> Thanks. I thought it was something recent, but this is fairly old.
>
> I saw your patch handles the `fn_stub' case among others and your test case
> included an `__fn_stub_foo' stub too, which is what PR target/53276 is all
> about, which is why I thought it may have been resolved and the existence
> of the PR accidentally missed.
>
> BTW, your test case has a stub of the `fn_stub' kind (`__fn_stub_foo') and
> one of the `call_fp_stub' kind (`__call_stub_fp_foo'), but none of the
> `call_stub' kind (for `foo' it would be called `__call_stub_foo'). The
> latter has AFAICT been addressed by r184379. Was the omission of the test
> case then deliberate for some reason (why?) or just accidental?
>

This is a follow-on patch to fix failures in the GDB MIPS16 thunk tests.
I've now augmented the test case to handle the "__call_stub_foo" case.
Does this look OK to commit?

Thanks,
Catherine

Attachments: fde.cl, fde.patch
gcc-4.9-20160210 is now available
Snapshot gcc-4.9-20160210 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160210/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options:
  svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 233309

You'll find:

  gcc-4.9-20160210.tar.bz2    Complete GCC

    MD5=c1e70f600ee67864af2dbd30df78be83
    SHA1=ed8d0deeaa3baa52e6e8c5ad6c5819a48e1fd092

Diffs from 4.9-20160203 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.