Thank you for the tips. I tried the following split condition:

  "reload_completed && FP_REG_P (operands[0])"

But the registers are still changed. How can I specify
"after register allocation" in the split condition?

Thanks,
David

----- Original Message -----
> From: "Jeff Law" <l...@redhat.com>
> To: "David Kang" <dk...@isi.edu>, gcc@gcc.gnu.org
> Sent: Monday, November 3, 2014 11:21:58 AM
> Subject: Re: how to keep a hard register across multiple instrutions?
> On 10/31/14 16:01, David Kang wrote:
> >
> > Hi,
> >
> > I'm a newbie in gcc porting.
> >
> > The architecture that I'm porting gcc to has a hardware FPU.
> > But the compiler has to generate code which builds an FPU instruction
> > in an integer register
> > at run-time and writes the value to the FPU command register.
> >
> > To make a single FPU instruction, three instructions are needed.
> > Two instructions make the FPU instruction in 32 bit (cmd,
> > operands[2], operands[1], operands[0]) format.
> > Here operands are the FPU register numbers, which can be 0 ~ 32.
> > As an example, f3 = f1 + f2 can be encoded as (code of 'add', 2, 1,
> > 3).
> >
> > And the third instruction writes it to an FPU command register.
> > The architecture can issue up to 3 instructions at a time.
> >
> > The difficulty lies in that we need to know the FPU register
> > numbers
> > for those operands to generate the FPU instruction.
> >
> > The easiest but lowest performance implementation is to generate
> > those three instructions
> > from a single "define_insn" as three consecutive instructions.
> > However, we lose all possible bundling of those 3 instructions with
> > other instructions for optimization.
> >
> > So, I'm trying to find a better way.
> > I used "define_insn_and_split" and split a single FPU instruction
> > into 3 instructions like this:
> > (Here I assume the use of register r10, but it can be any integer
> > register.)
> >
> > operands[0] = plus (operands[1], operands[2])
> >
> > ==>
> >
> > (1) r10 <- lower half of FPU instruction using
> > (code of 'add', operands[0], operands[1], operands[2])
> >
> > (2) r10 <- r10 | upper half of FPU instruction using (code of 'add',
> > operands[0], operands[1], operands[2])
> >
> > (3) (FPU cmd register) <- r10
> >
> >
> > The problem is that gcc catches that operands[0] is used before
> > the 3rd instruction,
> > and allocates two different hard registers for instructions (1,2)
> > and instruction (3).
> > So, when the code is generated, the first two instructions
> > assume the wrong register
> > for operands[0].
> > This happens especially frequently when the '-unroll' option is used.
> >
> > So, I wonder if there is a way to inform gcc to use the same hard
> > register for
> > operands[0] across those three instructions.
> > Is it possible?
> >
> > Or would there be any better way to generate efficient FPU code?
> > I will appreciate any advice or pointer to further information.
> Use a define_insn_and_split, but only split it after register
> allocation
> & reloading.
>
> Jeff

--
----------------------
Dr. Dong-In "David" Kang
Computer Scientist
USC/ISI
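
For illustration only, the command-word encoding described in the quoted
message might look roughly like this in C. The thread gives the field order
(cmd, operands[2], operands[1], operands[0]) and the 32-bit width, but not
the field widths or opcode values, so the 8-bit fields, FPU_CMD_ADD, and all
function names below are assumptions and not part of the actual port.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode value; the real 'add' code is not given in the thread. */
enum { FPU_CMD_ADD = 0x01 };

/* Highest FPU register number mentioned in the thread ("0 ~ 32"). */
enum { FPU_MAX_REGNO = 32 };

/* Pack (cmd, op2, op1, op0) into one 32-bit command word.
   Assumption: four 8-bit fields with cmd in the most significant byte. */
static uint32_t fpu_encode(uint8_t cmd, uint8_t op2, uint8_t op1, uint8_t op0)
{
    assert(op2 <= FPU_MAX_REGNO);
    assert(op1 <= FPU_MAX_REGNO);
    assert(op0 <= FPU_MAX_REGNO);
    return ((uint32_t)cmd << 24) | ((uint32_t)op2 << 16)
         | ((uint32_t)op1 << 8)  | (uint32_t)op0;
}

/* Value loaded by step (1): the low 16 bits of the command word. */
static uint32_t fpu_lower_half(uint32_t insn)
{
    return insn & 0x0000FFFFu;
}

/* Value OR-ed in by step (2): the high 16 bits, kept in place. */
static uint32_t fpu_upper_half(uint32_t insn)
{
    return insn & 0xFFFF0000u;
}
```

With these assumed field widths, f3 = f1 + f2 becomes
fpu_encode(FPU_CMD_ADD, 2, 1, 3), and OR-ing the two halves together
reproduces the word that step (3) writes to the FPU command register. The
sketch also shows why the halves built in steps (1) and (2) are only valid
if operands[0] still names the same hard register when step (3) is emitted.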