On 22 Mar 2005, Ian Lance Taylor wrote:

> Miles Bader <[EMAIL PROTECTED]> writes:
> 
> > I've defined SECONDARY_*_RELOAD_CLASS (and PREFERRED_* to try to help
> > things along), and am now running into more understandable reload
> > problems:  "unable to find a register to spill in class"  :-/
> > 
> > The problem, as I understand, is that reload doesn't deal with conflicts
> > between secondary and primary reloads -- which are common with my arch
> > because it's an accumulator architecture.
> > 
> > For instance, slightly modifying my previous example:
> > 
> >    Say I've got a mov instruction that only works via an accumulator A,
> >    and a two-operand add instruction.  "r" regclass includes regs A,X,Y,
> >    and "a" regclass only includes reg A.
> > 
> >    mov has constraints like:     0 = "g,a"   1 = "a,gi"
> >    and add3 has constraints:     0 = "a"     1 = "0"    2 = "ri" (say)
> > 
> > So if before reload you've got an instruction like:
> > 
> >    add temp, [sp + 4], [sp + 6]
> > 
> > and v2 and v3 are in memory, it will have to generate something like:
> > 
> >    mov A, [sp + 4]    ; primary reload 1 in X, with secondary reload 0 A
> >    mov X, A           ;   ""
> >    mov A, [sp + 6]    ; primary reload 2 in A, with no secondary reload
> >    add A, X
> >    mov temp, A
> > 
> > There's really only _one_ register that can be used for many reloads, A.
> 
> I don't think there is any way that reload can cope with this
> directly.  reload would have to get a lot smarter about ordering the
> reloads.
> 
> Since you need the accumulator for so much, one approach you should
> consider is not exposing the accumulator register until after reload.
> You could do this by writing pretty much every insn as a
> define_insn_and_split, with reload_completed as the split condition.
> Then you split into code that uses the accumulator.  Your add
> instruction permits you to add any two general registers, and you
> split into moving one into the accumulator, doing the add, and moving
> the result wherever it should go.  If you then split all the insns
> before the postreload pass, perhaps the generated code won't even be
> too horrible.
> 
> Ian
> 

This approach by itself has an obvious problem: it generates a lot of
redundant moves to and from the accumulator, because the accumulator is
exposed much too late.

Consider the three-address code (3AC):

add i,j,k
add k,l,m

It will be broken down into:

mov i,a
add j,a
mov a,k
mov k,a
add l,a
mov a,m

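where the third and fourth instructions are basically redundant: the
fourth reloads the value the third just stored, and the third is needed
only if k is used later.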
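
To make the breakdown concrete, here is a minimal Python sketch of the
naive lowering, plus a peephole pass that drops a reload of a value the
accumulator already holds.  The function names (lower_naive,
drop_redundant_reloads) and the tuple encoding of the 3AC are my own
assumptions for illustration; this is not GCC code, and a real port
would do this on RTL.

```python
def lower_naive(three_ac):
    """Lower (op, src1, src2, dst) tuples to accumulator code.

    Operand order follows the example above: "mov x,y" copies x into y,
    and "add x,a" adds x into the accumulator a.
    """
    out = []
    for op, src1, src2, dst in three_ac:
        out.append(f"mov {src1},a")   # load the first operand into a
        out.append(f"{op} {src2},a")  # combine the second operand into a
        out.append(f"mov a,{dst}")    # store the result
    return out


def drop_redundant_reloads(code):
    """Remove "mov X,a" when it directly follows "mov a,X"."""
    out = []
    for insn in code:
        if out and out[-1].startswith("mov a,"):
            stored = out[-1][len("mov a,"):]
            if insn == f"mov {stored},a":
                continue  # a already holds this value, skip the reload
        out.append(insn)
    return out


# Hypothetical encoding of "add i,j,k" and "add k,l,m" from the example.
program = [("add", "i", "j", "k"), ("add", "k", "l", "m")]
naive = lower_naive(program)
cleaned = drop_redundant_reloads(naive)
print("\n".join(cleaned))
```

The naive list is the six-instruction sequence above; the peephole
removes only "mov k,a".  Dropping "mov a,k" as well would need liveness
information showing that k is dead afterwards, which this sketch does
not track.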

I did a lot of processor architecture research about three years ago, and
I came to some interesting conclusions about accumulator architectures.

Basically, with naive code generation, you will generate 3x as many
instructions for an accumulator machine as for a 3AC machine.

If you have a SWAP instruction so you can swap the accumulator with the
index registers, then you can lower the instruction count penalty to about
2x that of a 3AC machine. If you think about this for a while, the reason
will become readily apparent.

Reaching this 2x figure requires a good understanding of how data flows
through the accumulator in an accumulator architecture.

Toshi